This is a Request for Filing a [X] Continuation [ ] Divisional Application under 37 CFR 1.53(b) of pending 
Application Serial No. 09/057,416, filed on April 8, 1998, of Olga Yurieva et al. for ENZYME DERIVED 
FROM THERMOPHILIC ORGANISMS THAT FUNCTIONS AS A CHROMOSOMAL REPLICASE, 
AND PREPARATION AND USES THEREOF. 

1. [X] Enclosed is a copy of the prior application, including the oath or declaration as originally 
filed and an affidavit or declaration verifying it as a true copy. 

2. [ ] Prepare a copy of the prior application. 

3. [X] A Filing Date as of the date of deposit in Express Mail is requested. The particulars of the 
Express Mail Deposit under 37 C.F.R. 1.10(b) are presented below. 



EXPRESS MAIL "MAILING LABEL NO.": EL629424206US 
DATE OF DEPOSIT: AUGUST 18, 2000 



4. a. [X] This is an application of a small entity under 37 CFR 1.9(f), and the amounts shown below in 
parentheses apply. A copy of the verified statement(s) filed in the prior application is 
enclosed. 



b. [ ] 



The Filing Fee is calculated below: 

FOR:                          NO. FILED    NO. EXTRA    RATE            BASIC FEE 
BASIC FEE                                                               $690.00 
TOTAL CLAIMS                  73 - 20 =    53           x $18 ($9)      $ 
INDEP. CLAIMS                 9 - 3 =      6            x $78 ($39)     $ 
TOTAL FILING FEE 
[ ] MULTIPLE DEPENDENT CLAIM(S) PRESENTED (IF APPLICABLE) 
                                                        x $260          $ 
TOTAL OF ABOVE CALCULATIONS                                             $ 
Reduction by 1/2 for filing by Small Entity (Note 37 CFR 1.9, 1.27, 1.28). 
If applicable, verified statement must be attached. 
TOTAL                                                                   $ 
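
The arithmetic implied by the fee table can be checked with a short sketch. The claim counts, the allowances of 20 total and 3 independent claims, and the rates are read from the table; the function name `filing_fee` is hypothetical, and the resulting figures are illustrative only, since the form itself leaves the amount fields blank. The multiple dependent claim surcharge is omitted because that box is not checked.

```python
def filing_fee(total_claims, indep_claims, basic_fee=690.0,
               total_rate=18.0, indep_rate=78.0, small_entity=True):
    """Illustrative fee arithmetic using the rates printed on the form."""
    extra_total = max(total_claims - 20, 0)   # total claims in excess of 20
    extra_indep = max(indep_claims - 3, 0)    # independent claims in excess of 3
    subtotal = basic_fee + extra_total * total_rate + extra_indep * indep_rate
    # A small entity pays one half of the calculated amount.
    return subtotal / 2 if small_entity else subtotal


if __name__ == "__main__":
    # 73 total claims and 9 independent claims, as entered on the form.
    print(f"${filing_fee(73, 9):.2f}")  # prints $1056.00
```

The same result follows from applying the parenthesized small-entity rates directly (53 x $9 plus 6 x $39, added to one half of the basic fee).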

5. [ ] The Commissioner is hereby authorized to charge any fees which may be required in addition 
to those calculated above, or credit any overpayment to Deposit Account 
No. 11-1153. 

6. [ ] A check in the amount of is enclosed. 

7. [ ] Cancel in this application original Claims of the prior application before calculating the 

Filing Fee. 

8. [X] Amend the specification by inserting before the first line the sentence: 

This Application is a [X] Continuation [ ] Division, of Application Serial No. 09/057,416, 
filed April 8, 1998. 

9. [ ] Transfer the drawings from the prior application to this application and abandon said prior 

application as of the Filing Date accorded this application. A duplicate copy of this sheet is 
enclosed for filing in the prior application file. 

10. [X] New Formal Drawings are enclosed. 

11. [ ] Priority of Application Serial No. , filed on in is claimed 
under 35 U.S.C. §119. 

12. [ ] The certified copy has been filed in prior application Serial No. , filed . 

13. [X] The prior application is assigned of record to The Rockefeller University. 

14. [X] The power of attorney in the prior application is to: 

Jackson Matalon (Attorney, Registration No. 22,441); Stefan J. Klauber (Attorney, 
Registration No. 22,604); David A. Jackson (Attorney, Registration No. 26,742); Michael D. 
Davis (Attorney, Registration No. 39,161); William C. Coppola (Attorney, Registration No. 
41,687); Mark S. Cohen (Attorney, Registration No. 42,425); and Christine E. Dietzel 
(Agent, Registration No. 37,309). 

a. [X] The power appears in the original papers in the prior application, a copy of which is enclosed. 

b. [X] Associate Power of Attorney is enclosed. 

c. [X] Address all future communications to: 

David A. Jackson, Esq. 
KLAUBER & JACKSON 
411 Hackensack Avenue 
Hackensack, NJ 07601 

15. [ ] A Preliminary Amendment is enclosed. 

16. [X] A copy of the Information Disclosure Statement from prior application. Pursuant to 37 

C.F.R. 1.97(d), copies of the cited references are not enclosed. 

17. [X] I hereby verify that the attached papers are a true copy of prior application Serial No. 
09/057,416, as originally filed on April 8, 1998. 



18. [X] I hereby state that the content of the paper and computer readable copies of the Sequence 
Listing submitted in accordance with 37 CFR §1.821(c) and (e), respectively, are the same. 



The undersigned declares further that all statements made herein of his/her own knowledge are true and that 
all statements made on information and belief are believed to be true; and further that these statements were 
made with the knowledge that willful false statements and the like so made are punishable by fine or 
imprisonment, or both, under section 1001 of Title 18 of the United States Code and that such willful false 
statements may jeopardize the validity of the application or any patent issuing thereon. 



Date: August 18, 2000 




Attorney for Applicant 
Registration No. 26,742 



KLAUBER & JACKSON 
411 Hackensack Avenue 
Hackensack, NJ 07601 
(201) 487-5800 

Patent 

Attorney's Docket No. 600-1-179N 

Applicant or Patentee: Olga Yurieva et al. 
Application or Patent No.: 09/057,416 
Filed or Issued: April 8, 1998 

For: ENZYME DERIVED FROM THERMOPHILIC ORGANISMS THAT FUNCTIONS AS A 
CHROMOSOMAL REPLICASE, AND PREPARATION AND USES THEREOF 

VERIFIED STATEMENT (DECLARATION) CLAIMING SMALL ENTITY 
STATUS (37 C.F.R. §§ 1.9(f) AND 1.27(d)) - NONPROFIT ORGANIZATION 

I hereby declare that I am an official empowered to act on behalf of the nonprofit organization 
identified below: 

NAME OF ORGANIZATION THE ROCKEFELLER UNIVERSITY 

ADDRESS OF ORGANIZATION 1230 YORK AVENUE, NEW YORK, NEW YORK 10021-6399 



TYPE OF ORGANIZATION 

[X] University or other institution of higher education 

[ ] Tax exempt under Internal Revenue Service Code (26 U.S.C. §§ 501(a) and 
501(c)(3)) 

[ ] Nonprofit scientific or educational under statute of state of The United States of 
America 

(Name of state , ) 

(Citation of statute ) 

[ ] Would qualify as tax exempt under Internal Revenue Service Code 

(26 U.S.C. §§ 501(a) and 501(c)(3)) if located in The United States of America 
[ ] Would qualify as nonprofit scientific or educational under statute of The United 

States of America if located in The United States of America 

(Name of state ) 

(Citation of statute ) 



I hereby declare that the nonprofit organization identified above qualifies as a nonprofit organization 
as defined in 37 C.F.R. § 1.9(e) for purposes of paying reduced fees under Sections 41(a) and 
41(b) of Title 35, United States Code, with regard to the invention entitled by inventor(s) 
described in 



[ ] the specification filed herewith 

[X] Application No. 09/057,416 

[ ] Patent No. , issued 

I hereby declare that rights under contract or law have been conveyed to and remain with the 
nonprofit organization with regard to the above-identified invention. 

If the rights held by the above-identified nonprofit organization are not exclusive, each individual, 
concern, or organization having rights to the invention is listed below,* and no rights to the 
invention are held by any person, other than the inventor, who would not qualify as an individual 
inventor under 37 C.F.R. § 1.9(c), or by any concern that would not qualify as either a small 
business concern under 37 C.F.R. § 1.9(d) or a nonprofit organization under 37 C.F.R. § 1.9(e). 

*NOTE: Separate verified statements are required from each named person, 
concern, or organization having rights to the invention averring to their status as 
small entities. (37 C.F.R. § 1.27.) 



FULL NAME 

ADDRESS 

[ ] individual [ ] small business concern [ ] nonprofit organization 
FULL NAME 

ADDRESS 

[ ] individual [ ] small business concern [ ] nonprofit organization 

I acknowledge the duty to file, in this application or patent, notification of any change in status 
resulting in loss of entitlement to small entity status prior to paying, or at the time of paying, the 
earlier of the issue fee and any maintenance fee due after the date on which status as a small entity 
is no longer appropriate. (37 C.F.R. § 1.28(b).) 

I hereby declare that all statements made herein of my own knowledge are true and that all 
statements made on information and belief are believed to be true; and further that these 
statements were made with the knowledge that willful false statements and the like so made are 
punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States 
Code; and that such willful false statements may jeopardize the validity of the application, any 
patent issuing thereon, or any patent to which this verified statement is directed. 

NAME OF PERSON SIGNING WILLIAM H. GRIESAR 

TITLE IN ORGANIZATION VICE PRESIDENT AND GENERAL COUNSEL 

ADDRESS OF PERSON SIGNING 1230 YORK AVENUE, NEW YORK, NY 10021-6399 



SIGNATURE 

ENZYME DERIVED FROM THERMOPHILIC ORGANISMS THAT 
FUNCTIONS AS A CHROMOSOMAL REPLICASE, AND PREPARATION 
AND USES THEREOF 



FIELD OF THE INVENTION 



The present invention relates to thermostable DNA polymerases, and more 
particularly to such polymerases as can serve as chromosomal replicases and are 
derived from thermophilic bacteria. More particularly, the invention extends to DNA 
polymerase III-type enzymes from thermophilic bacteria, including recombinant 
subunits thereof, to isolated DNA coding for such polymerases which hybridizes to 
DNA probes prepared from the DNA sequence coding for T. thermophilus and its 
subunits, to DNA and antibody probes employed in isolation of said DNA, as well as 
to related methods for isolating said DNA and methods to express and purify the DNA 
and its subunits from the respective genes such as dnaX, dnaA, dnaN, dnaQ, dnaE and 
the like. The invention also relates to the purification and use of T. thermophilus Pol 
III-type enzymes in efficient replication of a long natural template. 



BACKGROUND OF THE INVENTION 



Thermostable DNA polymerases have been disclosed previously as set forth in U.S. 
Patent No. 5,192,674 to Oshima et al., U.S. Patent Nos. 5,322,785 and 5,352,778 to 
Comb et al., and U.S. Patent No. 5,545,552, and others. All of the noted references 
recite the use of polymerases as important catalytic tools in the practice of molecular 
cloning techniques such as polymerase chain reaction (PCR). Each of the references 
states that a drawback of the extant polymerases is their limited thermostability, and 
consequent useful life in the participation in PCR. Such limitations also manifest 
themselves in the inability to obtain extended lengths of nucleotides, and in the 
instance of Taq polymerase, the lack of 3' to 5' exonuclease activity, and the drawback 
of the inability to excise misinserted nucleotides (Tindall, et al. (1990) Biochemistry 
29:5226-5231). 



More generally, such polymerases, including those disclosed in the referenced patents, 
are of the Polymerase I variety as they are approximately 90-95 kDa in size and 
may have 5' to 3' exonuclease activity. They define a single subunit with concomitant 
limits on their ability to hasten the amplification process and to promote the rapid 
preparation of longer strands of DNA. 

Chromosomal replicases are composed of several subunits in all organisms (Kornberg 
and Baker, 1992). In keeping with the need to replicate long chromosomes, replicases 
are rapid and highly processive multiprotein machines. All cellular replicases 
examined to date derive their processivity from one subunit that is shaped like a ring 
and completely encircles DNA (Kuriyan and O'Donnell, 1993; Kelman and 
O'Donnell, 1994). This "sliding clamp" subunit acts as a mobile tether for the 
polymerase machine (Stukenberg et al., 1991). The sliding clamp does not assemble 
onto the DNA by itself, but requires a complex of several proteins, called a "clamp 
loader" which couples ATP hydrolysis to the assembly of sliding clamps onto DNA 
(O'Donnell et al., 1992). Hence, cellular replicases are classically comprised of three 
components: a clamp, a clamp loader, and the DNA polymerase, and for purposes of 
the present invention, the foregoing components also serve as a broad definition of a 
"Pol III-type enzyme". 

DNA polymerase III holoenzyme (Pol III holoenzyme) is the multi-subunit replicase 
of the E. coli chromosome. Pol III holoenzyme is distinguished from Pol I type DNA 
polymerases by its high processivity (>50 kbp) and rapid rate of synthesis (750 nts/s) 
(reviewed in Kornberg and Baker, 1991; Kelman and O'Donnell, 1995). The high 
processivity and speed are rooted in a ring shaped subunit, called β, that encircles DNA 
and slides along it while tethering the Pol III holoenzyme to the template (Stukenberg 
et al., 1991; Kong et al., 1992). The ring shaped β clamp is assembled around DNA 
by the multisubunit clamp loader, called γ complex. The γ complex couples the 
energy of ATP hydrolysis to the assembly of the β clamp onto DNA. This γ complex 
clamp loader is an integral component of the Pol III holoenzyme particle. A brief 
overview of the organization of subunits within the holoenzyme and their function 
follows. 

Pol III holoenzyme consists of 10 different subunits, some of which are present in 
multiple copies for a total of 18 polypeptide chains (Onrust et al., 1995b). The 
organization of these subunits in the holoenzyme particle is illustrated in Fig. 1. As 
depicted in the diagram, the subunits of the holoenzyme can be grouped functionally 
into three components: 1) the DNA polymerase III core is the catalytic unit and 
consists of the α (DNA polymerase), ε (3'-5' exonuclease) and θ subunits (McHenry 
and Crow, 1979), 2) the β "sliding clamp" is the ring shaped protein that secures the 
core polymerase to DNA for processivity (Kong et al., 1992), and 3) the 5 protein γ 
complex (γδδ'χψ) is the "clamp loader" that couples ATP hydrolysis to assembly of β 
clamps around DNA (O'Donnell, 1987; Maki and Kornberg, 1988). A dimer of the τ 
subunit acts as a "macromolecular organizer" holding together two molecules of core 
and one molecule of γ complex forming the Pol III* subassembly (Onrust et al., 
1995b). This organizing role of τ to form Pol III* is indicated in the center of Fig. 1. 
Two β dimers associate with the two cores within Pol III* to form the holoenzyme 
capable of replicating both strands of duplex DNA simultaneously (Maki et al., 
1998). 

The DNA polymerase III holoenzyme assembles onto a primed template in two 
distinct steps. In the first step, the γ complex assembles the β clamp onto the DNA. 
The γ complex and the core polymerase utilize the same surface of the β ring and they 
cannot both utilize it at the same time (Naktinis et al., 1996). Hence, in the second 
step the γ complex moves away from β thus allowing access of the core polymerase to 
the β clamp for processive DNA synthesis. The γ complex and core remain attached 
to each other during this switching process by the τ subunit organizer. 

The γ complex consists of 5 different subunits (γ₂₋₄δ₁δ'₁χ₁ψ₁). An overview of the 
mechanism of the clamp loading process follows. The δ subunit is the major touch 
point to the β clamp and leads to ring opening, but δ is buried within γ complex such 
that contact with β is prevented (Naktinis et al., 1995). The γ subunit is the ATP 
interactive protein but is not an ATPase by itself (Tsuchihashi and Kornberg, 1989). 
The δ' subunit bridges the δ and γ subunits resulting in a γδδ' complex that exhibits 
DNA dependent ATPase activity and is competent to assemble clamps on DNA 
(Onrust et al., 1991). Upon binding of ATP to γ, a change in the conformation of the 
complex exposes δ for interaction with β (Naktinis et al., 1995). The function of the 
smaller subunits, χ and ψ, is to contact SSB (through χ) thus promoting clamp 
assembly and high processivity during replication (Kelman and O'Donnell, 1995). 

The three component Pol III-type enzyme in eukaryotes contains a clamp that has the 
same shape as E. coli β, but instead of a homodimer it is a homotrimer. This 
homotrimeric ring, called PCNA (proliferating cell nuclear antigen), has 6 domains 
like β, but instead of each PCNA monomer being composed of 3 domains and 
dimerizing to form a 6 domain ring (e.g. like β), the PCNA monomer has 2 domains 
and it trimerizes to form a 6 domain ring (Krishna et al., 1994; Kuriyan and 
O'Donnell, 1993). The chain fold of the domains is the same in prokaryotes (β) and 
eukaryotes (PCNA) and thus the rings have the same overall 6-domain ring shape. 
The clamp loader of the eukaryotic Pol III-type replicase is called RFC (Replication 
factor C) and it consists of subunits having homology to the γ and δ' subunits of the E. 
coli γ complex. The eukaryotic DNA polymerase III-type enzyme contains either of 
two DNA polymerases, DNA polymerase δ and DNA polymerase ε. It is entirely 
conceivable that yet other types of DNA polymerases can function with either a 
PCNA or β clamp to form a Pol III-type enzyme (for example, DNA polymerase II of 
E. coli functions with the β subunit placed onto DNA by the γ complex clamp loader). 

The bacteriophage T4 also utilizes a Pol III-type 3-component replicase. The clamp is 
a homotrimer like PCNA, called gene 45 protein. The gene 45 protein forms the same 
6-domain ring structure as β and PCNA. The clamp loader is a complex of two 
subunits called the gene 44/62 protein complex. The DNA polymerase is the gene 43 
protein and it is stimulated by the gene 45 sliding clamp when it is assembled onto 
DNA by the 44/62 protein clamp loader. The Pol III-type enzyme may be either 
bound together into one particle (e.g., E. coli Pol III holoenzyme), or its three 
components may not be assembled together into a stable particle in solution (like the 
eukaryotic Pol III-type replicases). 



There is an early report on separation of three DNA polymerases from T.th. cells; 
however, each polymerase form was reminiscent of the preexisting types of DNA 
polymerase isolated from thermophiles in that each polymerase was in the 110,000- 
120,000 range and lacked 3'-5' exonuclease activity (Ruttimann et al., 1985). These 
are well below the molecular weight of Pol III-type complexes that contain, in addition 
to the DNA polymerase subunit, other subunits such as γ and τ. Although the three 
polymerases displayed some differences in activity (column elution behavior, and 
optimum divalent cation, template, and temperatures) it seems likely that these three 
forms were either different repair type polymerases or derivatives of one repair 
enzyme (e.g. Pol I) that was modified into three forms by post translational 
modification(s) that altered their properties (e.g. phosphorylation, methylation, slight 
proteolytic clipping of residues that alter activity, or association with different ligands 
such as a small protein or contaminating DNA). Despite this previous work, it 
remained to be demonstrated that thermophiles harbor a Pol III-type enzyme that 
contains multiple subunits such as γ and/or τ, functions with a sliding clamp 
accessory protein, or can extend a primer over a long stretch of ssDNA. 

Ruttimann, C., Cotoras, M., Zaldivar, J., and Vicuna, R. (1986) DNA polymerases from the 
extremely thermophilic bacterium Thermus thermophilus HB-8. European J. of 
Biochem. 149, 41-46. 

Previously it was not known how thermophilic bacteria replicated - only Pol I's have 
been reported. By distinction, chromosomal replicases such as Polymerase III 
identified in E. coli, if available in a thermostable bacterium, with all its accessory 
subunits, could provide a great improvement over the Polymerase I's, in that they are 
generally much more efficient - about 5 times faster and much more highly 
processive. Hence, one may expect faster and longer chain production in PCR, and 
higher quality of DNA sequencing ladders. Clearly the ability to practice such 
synthetic techniques as PCR would be enhanced by these methods disclosed for how 
to obtain genes and subunits of DNA polymerase III holoenzyme from thermophilic 
sources. 
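
To make the rate comparison concrete, the following sketch estimates how long one uninterrupted polymerase would need to copy a template. The 750 nt/s figure is the E. coli Pol III holoenzyme rate quoted earlier; the slower rate is simply that figure divided by five, following the "about 5 times faster" estimate, and is an assumption rather than a measured value. The 7,500 nt template length and the function name `synthesis_time_s` are hypothetical. The sketch ignores processivity, that is, any time lost when a polymerase dissociates and must rebind.

```python
POL3_RATE_NT_PER_S = 750.0                      # rate quoted in the text for Pol III holoenzyme
POL1_RATE_NT_PER_S = POL3_RATE_NT_PER_S / 5.0   # assumption: "about 5 times" slower


def synthesis_time_s(template_nt, rate_nt_per_s):
    """Seconds needed to copy template_nt nucleotides at a constant rate."""
    if rate_nt_per_s <= 0:
        raise ValueError("rate must be positive")
    return template_nt / rate_nt_per_s


if __name__ == "__main__":
    template_nt = 7500  # hypothetical template length
    print(synthesis_time_s(template_nt, POL3_RATE_NT_PER_S))  # 10.0
    print(synthesis_time_s(template_nt, POL1_RATE_NT_PER_S))  # 50.0
```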

SUMMARY OF THE INVENTION 

In accordance with the present invention, DNA Polymerase III-type enzymes as 
defined herein are disclosed that may be isolated and purified from a thermophilic 
bacterial source, that can function as a chromosomal replicase, and that possess all 
of the structural and processive advantages sought and recited above. More 
particularly, the invention extends to the Polymerase III-type enzymes derived from 
thermostable thermophilic bacteria that exhibit the ability to extend a primer over a 
long stretch of ssDNA at elevated temperature, the ability to be stimulated by a 
cognate sliding clamp of the type that is assembled on DNA by a "clamp loader" (e.g. 
γ complex), have clamp loading sub-units that show DNA stimulated ATPase activity 
at elevated temperature and/or ionic strength, and have a DNA polymerase-associated 
3'-5' exonuclease activity (e.g., ε subunit). Representative thermophilic polymerases 
include those isolated from the thermophilic bacteria Thermus thermophilus (Tth 
polymerase), Thermococcus litoralis (Tli or VENT™ polymerase), Pyrococcus 
furiosus (Pfu or DEEPVENT polymerase), Pyrococcus woesei (Pwo polymerase) and 
other Pyrococcus species, Bacillus stearothermophilus (Bst polymerase), Sulfolobus 
acidocaldarius (Sac polymerase), Thermoplasma acidophilum (Tac polymerase), 
Thermus flavus (Tfl/Tub polymerase), Thermus ruber (Tru polymerase), Thermus 
brockianus (DYNAZYME™ polymerase), Thermotoga neapolitana (Tne polymerase; 
See WO 96/10640), Thermotoga maritima (Tma polymerase; See U.S. Patent No. 
5,374,553) and other species of the Thermotoga genus (Tsp polymerase) and 
Methanobacterium thermoautotrophicum (Mth polymerase). In a preferred 
embodiment, the thermophiles comprise those of the Thermus and Thermotoga 
species, and particularly T.th., Tne and Tma. 

A particular Polymerase III-type enzyme in accordance with the invention may 
include at least one of the following sub-units: 

A. a γ subunit having an amino acid sequence selected from the formula 
set forth in SEQ ID NOS:4 and 5; 

B. a τ subunit having an amino acid sequence corresponding to the 
formula set forth in SEQ ID NO:2; 

C. an ε subunit having an amino acid sequence corresponding to the 
formula set forth in SEQ ID NO:95; 

D. an α subunit including an amino acid sequence corresponding to the 
formula set forth in SEQ ID NO:87; 

E. a β subunit having an amino acid sequence corresponding to the 
formula set forth in SEQ ID NO:107; and 

variants, including allelic variants, muteins, analogs and fragments of any of 
subparts (A) through (E), and combinations thereof, capable of functioning in DNA 
amplification and sequencing. 

The invention also extends to the genes that correspond to and can code on expression 
for the subunits set forth above, and accordingly includes the following: dnaX, dnaQ, 
dnaE and dnaN, and conserved variants and active fragments thereof. 

Accordingly, the Polymerase III-type enzyme of the present invention comprises at 
least one gene encoding a subunit thereof, which gene is selected from the group 
consisting of dnaX, dnaQ, dnaE and dnaN, and combinations thereof. More 
particularly, the invention extends to the nucleic acid molecule encoding the γ and τ 
subunits, and includes the dnaX gene which has a nucleotide sequence as set forth in 
SEQ ID NO:3, as well as conserved variants, active fragments and analogs thereof. 
Likewise, the nucleotide sequences encoding the α subunit (the dnaE gene), the ε 
subunit (dnaQ gene) and the β subunit (dnaN gene) each comprise the nucleotide 
sequences as set forth respectively, in SEQ ID NOS:86, 94 and 106, as well as 
conserved variants, active fragments and analogs thereof. 

The invention also provides methods and products for identifying, isolating and 
cloning DNA molecules which encode such accessory subunits encoded by the recited 
genes of the DNA polymerase III-type enzyme hereof. 

Yet further, the invention extends to Polymerase III-type enzymes prepared by the 
purification of an extract taken from e.g. the particular thermophile under 
examination, treated with appropriate solvents and then subjected to chromatographic 
separation on e.g. an anion exchange column, followed by analysis of long chain 
synthetic ability or Western analysis of the respective peaks against antibody to at 
least one of the anticipated enzyme subunits to confirm presence of Pol III, and 
thereafter, peptide sequencing of subunits that co-purify and amplification to obtain 
the putative gene and its encoded enzyme. 

The present invention also relates to recombinant γ, α and β subunits from 
thermophiles. In the instance of the γ and τ subunits, the invention includes the 
characterization of a frameshifting sequence that is internal to the gene and specifies 
relative abundance of the γ and τ gene products of dnaX. From this characterization it 
is obvious how to increase expression of either one of the subunits at the expense of 
the other (i.e. a mutant frameshift could make all τ; simple recloning at the end of the 
frameshift could make exclusively γ and no τ). 
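
How a translational frameshift can yield a full-length product and a truncated product from one gene can be sketched in code. The sequence below is a short made-up toy and is not the T. thermophilus dnaX sequence; the names `TOY_GENE` and `translate` are hypothetical. The sketch models a -1 frameshift by stepping the reading position back one nucleotide once translation reaches a run of A residues, so that the shifted frame meets a stop codon early.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Toy sequence (an assumption for illustration): ATG GCT AAA AAA AGT GAG TGG CCT TAA
TOY_GENE = "ATGGCTAAAAAAAGTGAGTGGCCTTAA"


def translate(dna, shift_at=None, shift=0):
    """Translate from index 0 until a stop codon or the end of the sequence.

    If shift_at is given, the reading position moves by `shift` nucleotides
    the first time it reaches that index (shift=-1 re-reads one base).
    """
    protein = []
    pos = 0
    shifted = False
    while pos + 3 <= len(dna):
        if shift_at is not None and not shifted and pos >= shift_at:
            pos += shift
            shifted = True
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
        pos += 3
    return "".join(protein)


if __name__ == "__main__":
    print(translate(TOY_GENE))                          # MAKKSEWP (full-length, tau-like)
    print(translate(TOY_GENE, shift_at=12, shift=-1))   # MAKKK (truncated, gamma-like)
```

In this toy, removing the slippery run would leave only the full-length product, while placing a stop codon directly after the run in the original frame would leave only the short one, which parallels the two strategies mentioned in the preceding paragraph.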

In a further aspect of the present invention, DNA probes can be constructed from the 
DNA sequences coding for, e.g. the T.th. dnaX, dnaQ, dnaE, dnaA and dnaN genes, 
conserved variants and active fragments thereof, all as defined herein, and may be 
used to identify and isolate the corresponding genes coding for the subunits of DNA 
polymerase III holoenzyme from other thermophiles, such as those listed earlier 
herein. Accordingly, all chromosomal replicases (DNA Polymerase III-type) from 
thermophilic sources are contemplated and included herein. 

The invention also extends to methods for identifying Polymerase III-type enzymes 
by use of the techniques of long-chain extension and elucidation of subunits with 
antibodies, as described herein and with reference to the examples. 

The invention further extends to the isolated and purified DNA Polymerase III, the 
amino acid sequences of the γ, τ, ε, α and β subunits, as set forth in SEQ ID NOS:4, 
5, 2, 95, 87, and 107, and the nucleotide sequences of the corresponding genes from 
T.th. set forth, e.g. in SEQ ID NOS:3 (dnaX), 94 (dnaQ), 86 (dnaE) and 106 (dnaN), 
as well as to active fragments thereof, oligonucleotides and probes prepared or derived 
therefrom and the transformed cells that may be likewise prepared. Accordingly, the 
invention comprises the individual subunits enumerated above and hereinafter, 
corresponding isolated polynucleotides and respective amino acid sequences for each 
of the γ, τ, ε, α and β subunits, and to conserved variants, fragments, and the like, as 
well as to methods of their preparation and use in DNA amplification and sequencing. 
In a particular embodiment, the invention extends to vectors for the expression of the 
sub-unit genes of the present invention, and more specifically to the vectors 
pET16dnaX and pET24dnaN. 

The invention also includes methods for the preparation of the DNA Polymerase 
III-type enzymes and the corresponding subunit genes of the present invention, and to the 
use of the enzymes and constructs having active fragments thereof, in the preparation, 
reconstitution or modification of like enzymes, as well as in amplification and 
sequencing of DNA by methods such as PCR, and like protocols, and to the DNA 
molecules amplified and sequenced by such methods. In this regard, a Pol III-type 
enzyme that is reconstituted in the absence of ε, or using a mutated ε with less 3'-5' 
exonuclease activity, may be a superior enzyme in either PCR or DNA sequencing 
applications (e.g. Tabor and Richardson, 1995). 

The invention is directed to methods for amplifying and sequencing a DNA molecule, 
particularly via the polymerase chain reaction (PCR), using the present DNA 
polymerase III-type enzymes or complexes. In particular, the invention extends to 
methods of amplifying and sequencing of DNA using thermostable Pol III-type 
enzyme complexes isolated from thermophilic bacteria such as Thermotoga and 
Thermus species, or recombinant thermostable enzymes. The invention also provides 
amplified DNA molecules made by the methods of the invention, and kits for 
amplifying or sequencing a DNA molecule by the methods of the invention. 

In this connection, the invention extends to methods for amplification of DNA that 
can achieve long chain extension of primed DNA, as by the application and use of 
Polymerase III-type enzymes of the present invention. An illustration of such 
methods is presented in Examples 13 and 14, infra. 

Likewise, kits for amplification and sequencing of such DNA molecules are included, 
which kits contain the enzymes of the present invention, including subunits thereof, 
together with other necessary or desirable reagents and materials, and directions for 
use. The details of the practice of the invention as set forth above and later on herein, 
and with reference to the patents and literature cited herein, are all expressly 
incorporated herein by reference and made a part hereof. 

As stated, and in accordance with a principal object of the present invention, 
Polymerase III-type enzymes and their sub-units are provided that are derived from 
thermophiles and that are adapted to participate in improved DNA amplification and 
sequencing techniques, and the consequent ability to prepare larger DNA strands more 
rapidly and accurately. 

It is a further object of the present invention to provide DNA molecules that are 
amplified and sequenced using the Polymerase III-type enzymes hereof. 

It is a still further object of the present invention to provide enzymes and 
corresponding methods for amplification and sequencing of DNA that can be 
practiced without the participation of the clamp-loading component of the enzyme. 

It is a still further object of the present invention to provide kits and other assemblies 
of materials for the practice of the methods of amplification and sequencing as 
aforesaid, that include and use the DNA polymerase III-type enzymes herein as part 
thereof. 

Other objects and advantages will become apparent from a review of the ensuing 
description which proceeds with reference to the following illustrative drawings. 

DESCRIPTION OF THE DRAWINGS 

FIGURE 1 is a schematic depiction of the structure and components of enzymes of the 
general family to which the enzymes of the present invention belong. 

FIGURE 2. Alignment of the N-terminal regions of E. coli and B. subtilis dnaX gene 
product - Asterisks indicate identities. The ATP binding consensus sequence is 
indicated. The two regions used for PCR primer design are shown in bold. 

FIGURE 3. Southern analysis of T. thermophilus genomic DNA - Genomic DNA 
was analyzed for presence of the dnaX gene using the PCR radiolabeled probe. 
Enzymes used for digestion are shown above each lane. The numbering to the right 
corresponds to the length of DNA fragments (kb). 

FIGURES 4A and 4B depict the full sequence of the dnaX gene of T. thermophilus - 
DNA sequence (upper case, and corresponding to SEQ ID NO:1) and predicted amino 
acid sequence (lower case, and corresponding to SEQ ID NO:2) yields a 529 amino 
acid protein (τ) of 58.0 kDa. A putative frameshifting sequence containing several A 
residues 1478-1486 (underlined) may produce a smaller protein (γ) of 49.8 kDa. The 
potential Shine-Dalgarno (S.D.) signal is bold and underlined. The start codon is in 
bold, and the stop codon for τ is marked by an asterisk. The potential stop codon for 
γ is shown in bold after the frameshift site, and two potential Shine-Dalgarno 
sequences upstream of the frameshift site are indicated. Sequences of the primers 
used for PCR are shown in italics above the nucleotide sequence of dnaX. The ATP 
binding site is indicated, and the asterisks above the four Cys residues near the ATP 
site indicate the putative Zn2+ finger. The proline rich area is indicated above the 
sequence. Numbering of the nucleotide sequence is presented to the right. 
Numbering of the amino acid sequence of τ is shown in parentheses to the right. 

FIGURE 4C depicts the isolated DNA coding sequence for the dnaX gene (also 
present in FIGURES 4A and 4B) in accordance with the invention, and corresponds to 
SEQ ID NO:3. 

FIGURE 4D depicts the polypeptide sequence of the γ subunit of the Polymerase III 
of the present invention, and corresponds to SEQ ID NO:4. 

FIGURE 4E depicts the polypeptide sequence of the γ subunit of the Polymerase III 
of the present invention defined by a -1 frameshift, and corresponds to SEQ ID NO:4. 

FIGURE 4F depicts the polypeptide sequence of the γ subunit of the Polymerase III 
of the present invention defined by a -2 frameshift, and corresponds to SEQ ID NO:5. 

FIGURE 5. Alignment of the γ/τ ATP binding domains for different bacteria - Dots 
indicate those residues that are identical to the E. coli dnaX sequence. The ATP 
consensus site is underlined, and the conserved cysteine residues that form the zinc 
finger are indicated with asterisks. E. coli, Escherichia coli; H. inf., Haemophilus 
influenzae; B. sub., Bacillus subtilis; C. cres., Caulobacter crescentus; M. gen., 
Mycoplasma genitalium; T.th., Thermus thermophilus. Alignments were produced 
using Clustal. 



FIGURE 6. Signal for ribosomal frameshifting in T.th. dnaX - The diagram shows 
part of the sequence of the RNA around the frameshifting site, including the suspected 
slippery sequence A9 (bold italic). The stop codon in the -2 reading frame is 
indicated. Also indicated are potential stem loop structures and the nearest stop 
codons in the -1 reading frame. 

FIGURE 7. Analysis of γ and τ in T.th. cells by Western - Whole cells were lysed in 
SDS and electrophoresed on a 10% SDS polyacrylamide gel, then transferred to a 
membrane and probed with polyclonal antibody against E. coli γ/τ as described in 
Experimental Procedures. Positions of molecular weight size markers are shown to 
the left. Putative T.th. γ and τ are indicated to the right. 

FIGURE 8. The frameshift sequence in T.th. dnaX promotes -1 and -2 frameshifts in 
E. coli - The region of the dnaX gene slippery sequence was cloned into the lacZ gene 
of pUC19 in three reading frames, then transformed into E. coli cells and plated on 
LB plates containing X-gal. The slippery sequence was also mutated by inserting two 
G residues into the A9 sequence and then cloned into pUC19 in all three reading 
frames. The colors of the colonies observed are indicated by the plus signs. The picture 
shows the colonies; the type of frameshift required for readthrough (blue color) is 
indicated next to the sector. 

FIGURE 9. Construction of the T.th. γ/τ expression vector - A genomic fragment 
containing a partial sequence of dnaX was cloned into pALTER-1. This fragment was 
subcloned into pUC19 (pUC19_dnaX). Then the N-terminal section of dnaX was 
amplified such that the fragment was flanked by NdeI (at the initiating codon) and the 
internal BamHI site. This fragment was inserted to form the entire coding sequence of 
the dnaX gene in pUC19 (pUC19dnaX). The dnaX gene was then cloned behind the 
polyhistidine leader in the T7 based expression vector pET16 to give pET16dnaX. 
Details are in "Experimental Procedures". 

FIGURE 10. Purification of recombinant T.th. γ and τ subunits - T.th. γ and τ 
subunits were expressed in E. coli harboring pET16dnaX. Molecular size markers are 
shown to the left of the gels, and the two induced proteins are labelled as γ and τ to the 
right of the gel. Panel A) 10% SDS gel of E. coli whole cell lysates before and after 
induction with IPTG. Panel B) 8% SDS gel of the two purification steps after cell 
lysis. First lane: the lysate was applied to a HiTrap Nickel chromatography column. 
Second lane: the T.th. γ/τ subunits were further purified on a Superose 12 gel 
filtration column. Third lane: the E. coli γ and τ subunits. Panel C) Western analysis 
of the pure T.th. γ and τ subunits (first lane) and E. coli γ and τ subunits (second 
lane). 

FIGURE 11. Gel filtration of T.th. γ and τ - T.th. γ and τ were gel filtered on a 
Superose 12 column. Column fractions were analyzed for ATPase activity and in a 
Coomassie Blue stained 10% SDS polyacrylamide gel. Positions of molecular weight 
markers are shown to the left of the gel. The elution positions of size standards 
analyzed in a parallel Superose 12 column under identical conditions are indicated 
above the gel: thyroglobulin (670 kDa), bovine gamma globulin (150 kDa), chicken 
ovalbumin (44 kDa), equine myoglobin (17 kDa). 

FIGURE 12. Characterization of the T.th. γ and τ ATPase activity - The T.th. γ/τ and 
E. coli τ subunits are compared in their ATPase activity characteristics. Due to the 
greater activity of E. coli τ, the values are plotted as percent for ease of comparison. 
Actual specific activities for 100% values are given below as pmol ATP 
hydrolyzed/30 min./pmol T.th. γ/τ (or pmol E. coli τ). Panel A) T.th. γ and τ ATPase 
is stimulated by the presence of ssDNA. T.th. γ/τ was incubated at 65 °C. Specific 
activity was: 11.5 (+DNA); 2.5 (-DNA). E. coli τ was assayed at 37 °C. Specific 
activity values were: 112.5 (+DNA); 7.3 (-DNA). Panel B) Temperature stability of 
DNA stimulated ATPase activity. T.th. γ/τ, 11.3 (65 °C); E. coli τ, 97.5 (37 °C). 
Panel C) Stability of T.th. γ/τ ATPase to NaCl. T.th. γ/τ, 8.4 (100 mM added NaCl 
and 65 °C); E. coli τ, 52.7 (0 M added NaCl and 37 °C). 

FIGURES 13A-13C are graphs that summarize the purification of the DNA 
polymerase III from T.th. extracts. Panel A) shows the activity and total protein in column 
fractions from the Heparin Agarose column. Peak 1 fractions were chromatographed 
on ATP agarose, and Panel B) depicts the ATP-agarose column step, and Panel C) 
shows the total protein and DNA polymerase activity eluted from the MonoQ column. 

FIGURE 14 is a 12% SDS polyacrylamide gel stained with Coomassie Blue (Panel A) 
of the MonoQ column. Load stands for the material loaded onto the column (ATP 
agarose bound fractions). FT stands for protein that flowed through the MonoQ 
column. Fractions are indicated above the gel. T.th. subunits α, τ, γ, δ, δ' in fractions 
17-19 are indicated by the labels placed between fractions 18 and 19. Additional 
small subunits may be present but difficult to visualize, or may have run off the gel. 
E. coli α, γ, δ shows a mixture of the α, γ and δ subunits of DNA polymerase III 
holoenzyme (they are labelled to the right in the figure). Panel B shows the Western 
results of an SDS gel of the MonoQ fractions probed with rabbit antiserum raised 
against the E. coli α subunit. L and FT are as described in Panel A. Fraction numbers 
are shown above the gel. The band that comigrates with E. coli α, and the band in the 
Coomassie Blue stained gel in Panel A, is marked with an arrow. This band was 
analyzed for microsequence and the results are shown in Fig. 15. 

FIGURE 15 shows the alignments of the peptides obtained from T.th. α subunit, 
TTH1 (shown in A) and TTH2 (shown in B), with the amino acid sequences of the α 
subunits of other organisms. The amino acid numbers of these regions within each 
respective protein sequence are shown to the right. The abbreviations of the 
organisms are as follows. E.coli - Escherichia coli, V.chol. - Vibrio cholerae, H.inf. - 
Haemophilus influenzae, R.prow. - Rickettsia prowazekii, H.pyl. - Helicobacter 
pylori, S.sp. - Synechocystis sp., M.tub. - Mycobacterium tuberculosis, T.th. - Thermus 
thermophilus. 

FIGURE 16 shows a partial nucleotide (Panel A) and amino acid (Panel B) sequence 
of the dnaE gene encoding the α subunit of DNA polymerase III holoenzyme. The 
peptide sequence in bold was obtained by microsequencing of the α subunit isolated 
from T.th. cells. 

FIGURE 17 shows an alignment of the amino acid sequence of ε subunits encoded by 
dnaQ of several organisms. The amino acid sequence of the Thermus thermophilus ε 
subunit of dnaQ is also shown. T.th., Thermus thermophilus; D.rad., Deinococcus 
radiodurans; Bac.sub., Bacillus subtilis; H.inf., Haemophilus influenzae; E.c., 
Escherichia coli; H.pyl., Helicobacter pylori. The regions used to obtain the inner 
part of the dnaQ gene are shown in bold. The starts used for expression of the T.th. ε 
subunit are marked. 

FIGURE 18 shows the nucleotide (Panel A) and amino acid (Panel B) sequence of the 
dnaQ gene encoding the ε subunit of DNA polymerase III holoenzyme. 

FIGURE 19 shows an alignment of the DnaA protein of several organisms. The 
amino acid sequence of the Thermus thermophilus DnaA protein is also shown. 
T.th., Thermus thermophilus; Bac.sub., Bacillus subtilis; E.c., Escherichia coli; H.pyl., 
Helicobacter pylori; M.tub., Mycobacterium tuberculosis; T.mar., Thermotoga 
maritima. 

FIGURE 20 shows the nucleotide (Panel A) and amino acid (Panel B) sequence of the 
dnaA gene of Thermus thermophilus. 

FIGURE 21 shows the nucleotide (Panel A) and amino acid (Panel B) sequence of the 
dnaN gene encoding the β subunit of DNA polymerase III holoenzyme. 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36 
GTGTGGATCC TTCTTCTTSC CCATSGC 
(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "PRIMER" 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37 
CACCGATTCC AGTGGTGCCT AGGTGTG 
(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "PRIMER" 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38 
CAACACCTGG TGTTCCAGGA GCCTGTGCTT 
(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "PRIMER" 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39 
CCAGAATCGT CTGCTGGTCG TAG 

FIGURE 26 shows the use of T.th. Pol III in extending singly primed M13mp18 to an 
RFII form. The scheme at the top shows the primed template in which a DNA 57mer 
was annealed to the M13mp18 ssDNA circle. Then T.th. β subunit (produced 
recombinantly) and T.th. Pol III were added to the DNA in the presence of radioactive 
nucleoside triphosphates. In panel B, the products of the reaction were analyzed in a 
0.8% native agarose gel. The positions of the ssDNA starting material, the RFII product, 
and of intermediate species are shown to the sides of the gel. Lane 1, use of Pol III 
from the Heparin Agarose peak 1. Lane 2, use of the non-Pol III DNA polymerase 
contained in peak 2 of the T.th. Heparin Agarose column. 

DETAILED DESCRIPTION OF THE INVENTION 

In accordance with the present invention there may be employed conventional 
molecular biology, microbiology, and recombinant DNA techniques within the skill 
of the art. Such techniques are explained fully in the literature. See, e.g., Sambrook 
et al., "Molecular Cloning: A Laboratory Manual" (1989); "Current Protocols in 
Molecular Biology" Volumes I-III [Ausubel, R. M., ed. (1994)]; "Cell Biology: A 
Laboratory Handbook" Volumes I-III [J. E. Celis, ed. (1994)]; "Current Protocols in 
Immunology" Volumes I-III [Coligan, J. E., ed. (1994)]; "Oligonucleotide Synthesis" 
(M.J. Gait ed. 1984); "Nucleic Acid Hybridization" [B.D. Hames & S.J. Higgins eds. 
(1985)]; "Transcription And Translation" [B.D. Hames & S.J. Higgins, eds. (1984)]; 
"Animal Cell Culture" [R.I. Freshney, ed. (1986)]; "Immobilized Cells And Enzymes" 
[IRL Press, (1986)]; B. Perbal, "A Practical Guide To Molecular Cloning" (1984). 

Therefore, if appearing herein, the following terms shall have the definitions set out 
below. 

The terms "DNA Polymerase III," "Polymerase III-type enzyme(s)", "Polymerase III 
enzyme complex(es)", "T.th. DNA Polymerase III", "clamp loader" and any variants 
not specifically listed, may be used herein interchangeably, as are β subunit, 
sliding clamp and clamp, as are also γ complex, clamp loader and RFC. As used 
throughout the present application and claims, these terms refer to proteinaceous material 
including single or multiple proteins, and extend to those proteins having the amino 
acid sequence data described herein and presented in the Figures and corresponding 
Sequence Listing entries, and the corresponding profile of activities set forth herein 
and in the Claims. Accordingly, proteins displaying substantially equivalent or 
altered activity are likewise contemplated. These modifications may be deliberate, 
such as modifications obtained through site-directed mutagenesis, or may be 
accidental, such as those obtained through mutations in hosts that are producers of the 
complex or its named subunits. Also, the terms "DNA Polymerase III," "T.th. DNA 
Polymerase III," "γ and τ subunits", "β subunit", "α subunit", "ε subunit", 
"sliding clamp" and "clamp loader" are intended to include within their scope proteins 
specifically recited herein as well as all substantially homologous analogs and allelic 
variations. 

Also as used herein, the term "thermolabile enzyme" refers to a DNA polymerase 
which is not resistant to inactivation by heat. For example, T5 DNA polymerase, the 
activity of which is totally inactivated by exposing the enzyme to a temperature of 
90 °C for 30 seconds, is considered to be a thermolabile DNA polymerase. As used 
herein, a thermolabile DNA polymerase is less resistant to heat inactivation than is a 
thermostable DNA polymerase. A thermolabile DNA polymerase typically will also 
have a lower optimum temperature than a thermostable DNA polymerase. 
Thermolabile DNA polymerases are typically isolated from mesophilic organisms, for 
example mesophilic bacteria or eukaryotes, including certain animals. 

As used herein, the term "thermostable enzyme" refers to an enzyme which is stable to 
heat and is heat resistant and catalyzes (facilitates) combination of the nucleotides in 
the proper manner to form the primer extension products that are complementary to 
each nucleic acid strand. Generally, the synthesis will be initiated at the 3' end of each 
primer and will proceed in the 5' direction along the template strand, until synthesis 
terminates, producing molecules of different lengths. 

The thermostable enzyme herein must satisfy a single criterion to be effective for the 
amplification reaction, i.e., the enzyme must not become irreversibly denatured 
(inactivated) when subjected to the elevated temperatures for the time necessary to 
effect denaturation of double-stranded nucleic acids. Irreversible denaturation for 
purposes herein refers to permanent and complete loss of enzymatic activity. The 
heating conditions necessary for denaturation will depend, e.g., on the buffer salt 
concentration and the length and nucleotide composition of the nucleic acids being 
denatured, but typically range from about 90 °C to about 96 °C for a time depending 
mainly on the temperature and the nucleic acid length, typically about 0.5 to four 
minutes. Higher temperatures may be tolerated as the buffer salt concentration and/or 
GC composition of the nucleic acid is increased. Preferably, the enzyme will not 
become irreversibly denatured at about 90°-100 °C. 

The thermostable enzymes herein preferably have an optimum temperature at which 
they function that is higher than about 40 °C, which is the temperature below which 
hybridization of primer to template is promoted, although, depending on (1) 
magnesium and salt concentrations and (2) composition and length of primer, 
hybridization can occur at higher temperature (e.g., 45°-70 °C). The higher the 
temperature optimum for the enzyme, the greater the specificity and/or selectivity of 
the primer-directed extension process. However, enzymes that are active below 40 °C, 
e.g., at 37 °C, are also within the scope of this invention provided they are heat-stable. 
Preferably, the optimum temperature ranges from about 50 °C to 90 °C, more preferably 
60°-80 °C. In this connection, the term "elevated temperature" as used herein is 
intended to cover sustained temperatures of operation of the enzyme that are equal to 
or higher than about 60 °C. 

The term "template" as used herein refers to a double-stranded or single-stranded 
DNA molecule which is to be amplified, synthesized or sequenced. In the case of a 
double-stranded DNA molecule, denaturation of its strands to form a first and a 
second strand is performed before these molecules may be amplified, synthesized or 
sequenced. A primer, complementary to a portion of a DNA template, is hybridized 
under appropriate conditions and the DNA polymerase of the invention may then 
synthesize a DNA molecule complementary to said template or a portion thereof. The 
newly synthesized DNA molecule, according to the invention, may be equal to or shorter 
in length than the original DNA template. Mismatch incorporation during the 
synthesis or extension of the newly synthesized DNA molecule may result in one or a 
number of mismatched base pairs. Thus, the synthesized DNA molecule need not be 
exactly complementary to the DNA template. 

The term "incorporating" as used herein means becoming a part of a DNA molecule or 
primer. 

As used herein "amplification" refers to any in vitro method for increasing the number 
of copies of a nucleotide sequence, or its complementary sequence, with the use of a 
DNA polymerase. Nucleic acid amplification results in the incorporation of 
nucleotides into a DNA molecule or primer thereby forming a new DNA molecule 
complementary to a DNA template. The formed DNA molecule and its template can 
be used as templates to synthesize additional DNA molecules. As used herein, one 
amplification reaction may consist of many rounds of DNA replication. DNA 
amplification reactions include, for example, polymerase chain reactions (PCR). One 
PCR reaction may consist of 30 to 100 "cycles" of denaturation and synthesis of a 
DNA molecule. In this connection, the use of the term "long stretches of DNA" as it 
refers to the extension of primer along DNA is intended to cover such extensions of an 
average length exceeding 7 kilobases. Naturally, such length will vary, and all such 
variations are considered to be included within the scope of the invention. 

Leu Val Asn Ile Asn Leu Val Lys Ile Ala Gln Glu Leu Asp Ile Lys 
20 25 30 

Ile Val 



(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: 

Asp Asn Tyr Phe Leu Glu Leu Met Asp His Gly Leu Thr Ile Glu Arg 
1 5 10 15 

Arg Val Arg Asp Gly Leu Leu Glu Ile Gly Arg Ala Leu Asn Ile Pro 
20 25 30 

Pro Leu 



(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 46 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: 

Asn Lys Arg Arg Ala Lys Asn Gly Glu Pro Pro Leu Asp Ile Ala Ala 
1 5 10 15 

Ile Pro Leu Asp Asp Lys Lys Ser Phe Asp Met Leu Gln Arg Ser Glu 
20 25 30 

Thr Thr Ala Val Phe Gln Leu Glu Ser Arg Gly Met Lys Asp 
35 40 45 

(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 46 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 

nomenclature, J. Biol. Chem., 243:3552-59 (1969), abbreviations for amino acid 
residues are shown in the following Table of Correspondence: 



TABLE OF CORRESPONDENCE 

SYMBOL                      AMINO ACID 
1-Letter      3-Letter 

Y             Tyr           tyrosine 
G             Gly           glycine 
F             Phe           phenylalanine 
M             Met           methionine 
A             Ala           alanine 
S             Ser           serine 
I             Ile           isoleucine 
L             Leu           leucine 
T             Thr           threonine 
V             Val           valine 
P             Pro           proline 
K             Lys           lysine 
H             His           histidine 
Q             Gln           glutamine 
E             Glu           glutamic acid 
W             Trp           tryptophan 
R             Arg           arginine 
D             Asp           aspartic acid 
N             Asn           asparagine 
C             Cys           cysteine 


It should be noted that all amino-acid residue sequences are represented herein by 
formulae whose left and right orientation is in the conventional direction of 
amino-terminus to carboxy-terminus. Furthermore, it should be noted that a dash at the 
beginning or end of an amino acid residue sequence indicates a peptide bond to a 
further sequence of one or more amino-acid residues. The above Table is presented to 
correlate the three-letter and one-letter notations which may appear alternately herein. 
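
By way of illustration only, the correspondence between the two notations can be sketched in Python. The dictionary below simply restates the Table of Correspondence; the function names `to_one_letter` and `to_three_letter` are hypothetical and form no part of the disclosure.

```python
# Three-letter to one-letter symbols, restating the Table of Correspondence
# (the 20 standard amino acid residues).
THREE_TO_ONE = {
    "Tyr": "Y", "Gly": "G", "Phe": "F", "Met": "M", "Ala": "A",
    "Ser": "S", "Ile": "I", "Leu": "L", "Thr": "T", "Val": "V",
    "Pro": "P", "Lys": "K", "His": "H", "Gln": "Q", "Glu": "E",
    "Trp": "W", "Arg": "R", "Asp": "D", "Asn": "N", "Cys": "C",
}
# Inverse mapping, one-letter to three-letter.
ONE_TO_THREE = {one: three for three, one in THREE_TO_ONE.items()}


def to_one_letter(three_letter_seq: str) -> str:
    """Convert a whitespace-separated three-letter sequence, written from
    amino-terminus to carboxy-terminus, into one-letter notation."""
    return "".join(THREE_TO_ONE[res] for res in three_letter_seq.split())


def to_three_letter(one_letter_seq: str) -> str:
    """Convert a one-letter sequence into space-separated three-letter notation."""
    return " ".join(ONE_TO_THREE[res] for res in one_letter_seq.upper())


if __name__ == "__main__":
    # Arbitrary eight-residue peptide used only as an example.
    print(to_one_letter("Asp Asn Tyr Phe Leu Glu Leu Met"))
    print(to_three_letter("DNY"))
```

An unrecognized symbol raises a `KeyError` in this sketch, which is one simple way to surface residues outside the table.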

A "replicon" is any genetic element (e.g., plasmid, chromosome, virus) that functions 
as an autonomous unit of DNA replication in vivo; i.e., capable of replication under its 
own control. 

A "vector" is a replicon, such as plasmid, phage or cosmid, to which another DNA 
segment may be attached so as to bring about the replication of the attached segment. 

A "DNA molecule" refers to the polymeric form of deoxyribonucleotides (adenine, 
guanine, thymine, or cytosine) in either its single-stranded form or a double-stranded 
helix. This term refers only to the primary and secondary structure of the molecule, 
and does not limit it to any particular tertiary forms. Thus, this term includes 
double-stranded DNA found, inter alia, in linear DNA molecules (e.g., restriction fragments), 
viruses, plasmids, and chromosomes. In discussing the structure of particular 
double-stranded DNA molecules, sequences may be described herein according to the normal 
convention of giving only the sequence in the 5' to 3' direction along the 
nontranscribed strand of DNA (i.e., the strand having a sequence homologous to the 
mRNA). 

An "origin of replication" refers to those DNA sequences that participate in DNA 
synthesis. 

A DNA "coding sequence" is a double-stranded DNA sequence which is transcribed 
and translated into a polypeptide in vivo when placed under the control of appropriate 
regulatory sequences. The boundaries of the coding sequence are determined by a 
start codon at the 5' (amino) terminus and a translation stop codon at the 3' (carboxyl) 
terminus. A coding sequence can include, but is not limited to, prokaryotic 
sequences, cDNA from eukaryotic mRNA, genomic DNA sequences from eukaryotic 
(e.g., mammalian) DNA, and even synthetic DNA sequences. A polyadenylation 
signal and transcription termination sequence will usually be located 3' to the coding 
sequence. 

Transcriptional and translational control sequences are DNA regulatory sequences, 
such as promoters, enhancers, polyadenylation signals, terminators, and the like, that 
provide for the expression of a coding sequence in a host cell. 

A "promoter sequence" is a DNA regulatory region capable of binding RNA 
polymerase in a cell and initiating transcription of a downstream (3' direction) coding 
sequence. For purposes of defining the present invention, the promoter sequence is 
bounded at its 3' terminus by the transcription initiation site and extends upstream (5' 
direction) to include the minimum number of bases or elements necessary to initiate 
transcription at levels detectable above background. Within the promoter sequence 
will be found a transcription initiation site (conveniently defined by mapping with 
nuclease S1), as well as protein binding domains (consensus sequences) responsible 
for the binding of RNA polymerase. Eukaryotic promoters will often, but not always, 
contain "TATA" boxes and "CAT" boxes. Prokaryotic promoters contain 
Shine-Dalgarno sequences in addition to the -10 and -35 consensus sequences. 

An "expression control sequence" is a DNA sequence that controls and regulates the 
transcription and translation of another DNA sequence. A coding sequence is "under 
the control" of transcriptional and translational control sequences in a cell when RNA 
polymerase transcribes the coding sequence into mRNA, which is then translated into 
the protein encoded by the coding sequence. 






A "signal sequence" can be included before the coding sequence. This sequence 
encodes a signal peptide, N-terminal to the polypeptide, that communicates to the host 
cell to direct the polypeptide to the cell surface or secrete the polypeptide into the 
media, and this signal peptide is clipped off by the host cell before the protein leaves 
the cell. Signal sequences can be found associated with a variety of proteins native to 
prokaryotes and eukaryotes. 

The term "oligonucleotide," as used generally herein, such as in referring to probes 
prepared and used in the present invention, is defined as a molecule comprised of two 
or more ribonucleotides, preferably more than three. Its exact size will depend upon 
many factors which, in turn, depend upon the ultimate function and use of the 
oligonucleotide. 

The term "primer" as used herein refers to an oligonucleotide, whether occurring 
naturally as in a purified restriction digest or produced synthetically, which is capable 
of acting as a point of initiation of synthesis when placed under conditions in which 
synthesis of a primer extension product, which is complementary to a nucleic acid 
strand, is induced, i.e., in the presence of nucleotides and an inducing agent such as a 
DNA polymerase and at a suitable temperature and pH. The primer may be either 
single-stranded or double-stranded and must be sufficiently long to prime the 
synthesis of the desired extension product in the presence of the inducing agent. The 
exact length of the primer will depend upon many factors, including temperature, 
source of primer and use of the method. For example, for diagnostic applications, 
depending on the complexity of the target sequence, the oligonucleotide primer 
typically contains 15-25 or more nucleotides, although it may contain fewer 
nucleotides. 

The primers herein are selected to be "substantially" complementary to different 
strands of a particular target DNA sequence. This means that the primers must be 
sufficiently complementary to hybridize with their respective strands. Therefore, the 
primer sequence need not reflect the exact sequence of the template. For example, a 
non-complementary nucleotide fragment may be attached to the 5' end of the primer, 
with the remainder of the primer sequence being complementary to the strand. 
Alternatively, non-complementary bases or longer sequences can be interspersed into 
the primer, provided that the primer sequence has sufficient complementarity with the 
sequence of the strand to hybridize therewith and thereby form the template for the 
synthesis of the extension product. 
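
As a rough, non-limiting sketch of what "substantially" complementary can mean in practice, the following Python fragment scores how much of the 3' portion of a primer pairs with a template site while ignoring any non-complementary 5' tail. The function names, the example sequences, and the choice to score only the 3'-most bases are illustrative assumptions, not a method recited herein; real hybridization also depends on temperature, salt and primer length.

```python
# Watson-Crick pairing for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def fraction_complementary(primer: str, site: str) -> float:
    """Fraction of the 3'-most len(site) primer bases that pair with `site`.

    Both sequences are written 5'->3'. Any extra bases at the 5' end of the
    primer (for example an added restriction site) are ignored, mirroring
    the non-complementary 5' fragment described in the text.
    """
    if len(primer) < len(site):
        raise ValueError("primer must be at least as long as the template site")
    core = primer.upper()[-len(site):]      # 3' portion that should anneal
    target = reverse_complement(site)       # perfectly matched primer
    matches = sum(p == t for p, t in zip(core, target))
    return matches / len(site)


if __name__ == "__main__":
    site = "ATGGCCTTAGGC"                   # hypothetical template site
    tailed_primer = "GGATCC" + reverse_complement(site)  # hypothetical 5' tail
    print(fraction_complementary(tailed_primer, site))
```

A primer with a single interspersed mismatch would score slightly below 1.0 in this sketch; what fraction suffices for hybridization is left to the conditions chosen.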

As used herein, the terms "restriction endonucleases" and "restriction enzymes" refer 
to bacterial enzymes, each of which cuts double-stranded DNA at or near a specific 
nucleotide sequence. 

A cell has been "transformed" by exogenous or heterologous DNA when such DNA 
has been introduced inside the cell. The transforming DNA may or may not be 
integrated (covalently linked) into chromosomal DNA making up the genome of the 
cell. In prokaryotes, yeast, and mammalian cells for example, the transforming DNA 
may be maintained on an episomal element such as a plasmid. With respect to 
eukaryotic cells, a stably transformed cell is one in which the transforming DNA has 
become integrated into a chromosome so that it is inherited by daughter cells through 
chromosome replication. This stability is demonstrated by the ability of the 
eukaryotic cell to establish cell lines or clones comprised of a population of daughter 
cells containing the transforming DNA. A "clone" is a population of cells derived 
from a single cell or common ancestor by mitosis. A "cell line" is a clone of a 
primary cell that is capable of stable growth in vitro for many generations. 

Two DNA sequences are "substantially homologous" when at least about 75% 
(preferably at least about 80%, and most preferably at least about 90 or 95%) of the 
nucleotides match over the defined length of the DNA sequences. Sequences that are 
substantially homologous can be identified by comparing the sequences using 
standard software available in sequence data banks, or in a Southern hybridization 
experiment under, for example, stringent conditions as defined for that particular 
system. Defining appropriate hybridization conditions is within the skill of the art. 
See, e.g., Maniatis et al., supra; DNA Cloning, Vols. I & II, supra; Nucleic Acid 
Hybridization, supra. 
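
The percentage criterion above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned without gaps over the same defined length, which is a simplification; the sequence comparison software referred to in the text performs the alignment itself. The function names and the default threshold argument are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two equal-length, pre-aligned
    nucleotide sequences carry the same base."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def substantially_homologous(seq_a: str, seq_b: str,
                             threshold: float = 75.0) -> bool:
    """Apply the 'at least about 75%' criterion; pass 80.0, 90.0 or 95.0
    for the preferred ranges mentioned in the text."""
    return percent_identity(seq_a, seq_b) >= threshold


if __name__ == "__main__":
    # Arbitrary eight-base sequences differing at one position.
    print(percent_identity("ATGCATGC", "ATGCATGG"))
    print(substantially_homologous("ATGCATGC", "ATGCATGG"))
```

Because the text says "at least about", any fixed cutoff in code is only an approximation of the definition.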

It should be appreciated that also within the scope of the present invention are DNA 
sequences encoding T.th. DNA Polymerase III which code for a T.th. DNA 
Polymerase III having the same amino acid sequence as SEQ ID NO:2, but which are 
degenerate to SEQ ID NO:2. By "degenerate to" is meant that a different three-letter 
codon is used to specify a particular amino acid. It is well known in the art that the 
following codons can be used interchangeably to code for each specific amino acid: 



Phenylalanine (Phe or F)      UUU or UUC 
Leucine (Leu or L)            UUA or UUG or CUU or CUC or CUA or CUG 
Isoleucine (Ile or I)         AUU or AUC or AUA 
Methionine (Met or M)         AUG 
Valine (Val or V)             GUU or GUC or GUA or GUG 
Serine (Ser or S)             UCU or UCC or UCA or UCG or AGU or AGC 
Proline (Pro or P)            CCU or CCC or CCA or CCG 
Threonine (Thr or T)          ACU or ACC or ACA or ACG 
Alanine (Ala or A)            GCU or GCC or GCA or GCG 
Tyrosine (Tyr or Y)           UAU or UAC 
Histidine (His or H)          CAU or CAC 
Glutamine (Gln or Q)          CAA or CAG 
Asparagine (Asn or N)         AAU or AAC 
Lysine (Lys or K)             AAA or AAG 
Aspartic Acid (Asp or D)      GAU or GAC 
Glutamic Acid (Glu or E)      GAA or GAG 
Cysteine (Cys or C)           UGU or UGC 
Arginine (Arg or R)           CGU or CGC or CGA or CGG or AGA or AGG 
Glycine (Gly or G)            GGU or GGC or GGA or GGG 
Tryptophan (Trp or W)         UGG 
Termination codon             UAA (ochre) or UAG (amber) or UGA (opal) 

It should be understood that the codons specified above are for RNA sequences. The 
corresponding codons for DNA have a T substituted for U. 
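
The notion of one sequence being "degenerate to" another can be illustrated with a short Python sketch built on the standard genetic code. The helper names (`translate`, `are_degenerate`) are hypothetical and serve only to show the idea: two different coding sequences are degenerate to one another when they translate to the same amino acid sequence. DNA input is handled by substituting U for T, mirroring the RNA/DNA correspondence noted above.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code as one-letter amino acids, codons enumerated in UCAG
# order (UUU, UUC, UUA, UUG, UCU, ...); "*" marks a termination codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def to_rna(seq: str) -> str:
    """Normalise a DNA or RNA coding sequence to upper-case RNA letters."""
    return seq.upper().replace("T", "U")


def translate(seq: str) -> str:
    """Translate a coding sequence, reading from the first base in frame."""
    rna = to_rna(seq)
    if len(rna) % 3:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, len(rna), 3))


def are_degenerate(seq_a: str, seq_b: str) -> bool:
    """True when two different sequences encode the same amino acid sequence."""
    return to_rna(seq_a) != to_rna(seq_b) and translate(seq_a) == translate(seq_b)
```

For instance, `ATGTTTCTG` and `AUGUUCUUA` differ at two codons yet both encode Met-Phe-Leu.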

Mutations can be made, e.g. in SEQ ID NO:1, or any of the nucleic acids set forth
herein, such that a particular codon is changed to a codon which codes for a different
amino acid. Such a mutation is generally made by making the fewest nucleotide
changes possible. A substitution mutation of this sort can be made to change an
amino acid in the resulting protein in a non-conservative manner (i.e., by changing the
codon from an amino acid belonging to a grouping of amino acids having a particular
size or characteristic to an amino acid belonging to another grouping) or in a
conservative manner (i.e., by changing the codon from an amino acid belonging to a
grouping of amino acids having a particular size or characteristic to an amino acid
belonging to the same grouping). Such a conservative change generally leads to less
change in the structure and function of the resulting protein. A non-conservative
change is more likely to alter the structure, activity or function of the resulting protein.
The present invention should be considered to include sequences containing
conservative changes which do not significantly alter the activity or binding
characteristics of the resulting protein.

The following is one example of various groupings of amino acids:

Amino acids with nonpolar R groups

Alanine
Valine
Leucine
Isoleucine
Proline
Phenylalanine
Tryptophan
Methionine

Amino acids with uncharged polar R groups

Glycine
Serine
Threonine
Cysteine
Tyrosine
Asparagine
Glutamine

Amino acids with charged polar R groups (negatively charged at pH 6.0)

Aspartic acid
Glutamic acid

Basic amino acids (positively charged at pH 6.0)

Lysine
Arginine
Histidine (at pH 6.0)

Another grouping may be those amino acids with phenyl groups:

Phenylalanine
Tryptophan
Tyrosine

Another grouping may be according to molecular weight (i.e., size of R groups):

Methionine               149
Histidine (at pH 6.0)    155
Phenylalanine            165
Arginine                 174
Tyrosine                 181
Tryptophan               204



Particularly preferred substitutions are:

- Lys for Arg and vice versa such that a positive charge may be maintained;
- Glu for Asp and vice versa such that a negative charge may be maintained;
- Ser for Thr such that a free -OH can be maintained; and
- Gln for Asn such that a free NH2 can be maintained.

Amino acid substitutions may also be introduced to substitute an amino acid with a
particularly preferable property. For example, a Cys may be introduced into a
potential site for disulfide bridges with another Cys. A His may be introduced as a
particularly "catalytic" site (i.e., His can act as an acid or base and is the most
common amino acid in biochemical catalysis). Pro may be introduced because of its
particularly planar structure, which induces β-turns in the protein's structure.

Two amino acid sequences are "substantially homologous" when at least about 70% 
of the amino acid residues (preferably at least about 80%, and most preferably at least 
about 90 or 95%) are identical, or represent conservative substitutions. 
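
As a non-limiting sketch, the amino acid homology test can be expressed in Python by counting positions that are either identical or conservative substitutions. The grouping used here is the example grouping by R-group character given above (other groupings are equally possible), the sequences are assumed to be pre-aligned to equal length, and the names `GROUPS`, `is_conservative` and `percent_homology` are hypothetical.

```python
# Example grouping by R-group character, using one-letter residue codes.
GROUPS = {
    "nonpolar": set("AVLIPFWM"),
    "uncharged polar": set("GSTCYNQ"),
    "acidic": set("DE"),
    "basic": set("KRH"),
}
# Reverse lookup: residue -> name of the group it belongs to.
GROUP_OF = {res: name for name, members in GROUPS.items() for res in members}


def is_conservative(original: str, replacement: str) -> bool:
    """True when a substitution stays within the same example grouping."""
    return (original != replacement
            and GROUP_OF[original] == GROUP_OF[replacement])


def percent_homology(seq_a: str, seq_b: str) -> float:
    """Percentage of positions that are identical or conservatively substituted."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    hits = sum(a == b or is_conservative(a, b) for a, b in zip(seq_a, seq_b))
    return 100.0 * hits / len(seq_a)
```

Under this grouping, Lys-to-Arg, Glu-to-Asp, Ser-to-Thr and Gln-to-Asn all count as conservative, whereas Lys-to-Glu does not.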

A "heterologous" region of the DNA construct is an identifiable segment of DNA
within a larger DNA molecule that is not found in association with the larger molecule
in nature. Thus, when the heterologous region encodes a mammalian gene, the gene
will usually be flanked by DNA that does not flank the mammalian genomic DNA in
the genome of the source organism. Another example of a heterologous coding
sequence is a construct where the coding sequence itself is not found in nature (e.g., a
cDNA where the genomic coding sequence contains introns, or synthetic sequences
having codons different than the native gene). Allelic variations or naturally-occurring
mutational events do not give rise to a heterologous region of DNA as
defined herein.

An "antibody" is any immunoglobulin, including antibodies and fragments thereof,
that binds a specific epitope. The term encompasses polyclonal, monoclonal, and
chimeric antibodies, the last mentioned described in further detail in U.S. Patent Nos.
4,816,397 and 4,816,567.

An "antibody combining site" is that structural portion of an antibody molecule
comprised of heavy and light chain variable and hypervariable regions that
specifically binds antigen.

The phrase "antibody molecule" in its various grammatical forms as used herein
contemplates both an intact immunoglobulin molecule and an immunologically active
portion of an immunoglobulin molecule.

Exemplary antibody molecules are intact immunoglobulin molecules, substantially
intact immunoglobulin molecules and those portions of an immunoglobulin molecule
that contain the paratope, including those portions known in the art as Fab, Fab',
F(ab')2 and F(v), which portions are preferred for use in the therapeutic methods
described herein.

Fab and F(ab')2 portions of antibody molecules are prepared by the proteolytic
reaction of papain and pepsin, respectively, on substantially intact antibody molecules
by methods that are well-known. See, for example, U.S. Patent No. 4,342,566 to
Theofilopolous et al. Fab' antibody molecule portions are also well-known and are
produced from F(ab')2 portions followed by reduction of the disulfide bonds linking
the two heavy chain portions as with mercaptoethanol, and followed by alkylation of
the resulting protein mercaptan with a reagent such as iodoacetamide. An antibody
containing intact antibody molecules is preferred herein.

The phrase "monoclonal antibody" in its various grammatical forms refers to an
antibody having only one species of antibody combining site capable of
immunoreacting with a particular antigen. A monoclonal antibody thus typically
displays a single binding affinity for any antigen with which it immunoreacts. A
monoclonal antibody may therefore contain an antibody molecule having a plurality
of antibody combining sites, each immunospecific for a different antigen; e.g., a
bispecific (chimeric) monoclonal antibody.

A DNA sequence is "operatively linked" to an expression control sequence when the
expression control sequence controls and regulates the transcription and translation of
that DNA sequence. The term "operatively linked" includes having an appropriate
start signal (e.g., ATG) in front of the DNA sequence to be expressed and maintaining
the correct reading frame to permit expression of the DNA sequence under the control
of the expression control sequence and production of the desired product encoded by
the DNA sequence. If a gene that one desires to insert into a recombinant DNA
molecule does not contain an appropriate start signal, such a start signal can be
inserted in front of the gene.

The term "standard hybridization conditions" refers to salt and temperature conditions 
substantially equivalent to 5 x SSC and 65 °C for both hybridization and wash. 
However, one skilled in the art will appreciate that such "standard hybridization 
conditions" are dependent on particular conditions including the concentration of 
sodium and magnesium in the buffer, nucleotide sequence length and concentration, 
percent mismatch, percent formamide, and the like. Also important in the 
determination of "standard hybridization conditions" is whether the two sequences 
hybridizing are RNA-RNA, DNA-DNA or RNA-DNA. Such standard hybridization 
conditions are easily determined by one skilled in the art according to well known 
formulae, wherein hybridization is typically 10-20°C below the predicted or 
determined Tm with washes of higher stringency, if desired.
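
The passage above refers to "well known formulae" without naming one. As a hedged illustration only, the Python sketch below uses one commonly cited empirical approximation for long DNA-DNA hybrids, Tm = 81.5 + 16.6 log10[Na+] + 0.41(%GC) - 0.61(%formamide) - 500/L; the choice of this particular formula, and all function names, are assumptions rather than anything specified herein. The sketch then derives the hybridization window of 10-20°C below the estimated Tm.

```python
import math


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must be non-empty")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def estimate_tm(seq: str, na_molar: float, formamide_percent: float = 0.0) -> float:
    """Approximate Tm in degrees C for a long DNA-DNA hybrid.

    Assumed formula (not taken from the source text):
    81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 0.61*(%formamide) - 500/L
    """
    if na_molar <= 0:
        raise ValueError("sodium concentration must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent(seq)
            - 0.61 * formamide_percent
            - 500.0 / len(seq))


def hybridization_window(tm: float) -> tuple:
    """Temperature range lying 20 to 10 degrees C below the given Tm."""
    return (tm - 20.0, tm - 10.0)
```

RNA-RNA and RNA-DNA hybrids, short oligonucleotides and mismatched duplexes behave differently, so an estimate of this kind is only a starting point for empirical optimization.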

In its primary aspect, the present invention concerns the identification of a class of
DNA Polymerase III-type enzymes or complexes found in thermophilic bacteria such
as Thermus thermophilus (T.th.), and other eubacteria such as Thermotoga, which
exhibit the following characteristics, among their properties: the ability to extend a
primer over a long stretch of ssDNA at elevated temperature, stimulation by its
cognate sliding clamp of the type that is assembled on DNA by a clamp loader (e.g., γ
complex), accessory subunits that exhibit DNA-stimulated ATPase activity at elevated
temperature and/or ionic strength, and an associated 3'-5' exonuclease activity. In a
particular aspect, the invention extends to Polymerase III-type enzymes derived from
a broad class of thermophilic bacteria that include
polymerases isolated from the thermophilic bacteria Thermus thermophilus (Tth
polymerase), Thermococcus litoralis (Tli or VENT™ polymerase), Pyrococcus
furiosus (Pfu or DEEPVENT™ polymerase), Pyrococcus woesei (Pwo polymerase) and
other Pyrococcus species, Bacillus stearothermophilus (Bst polymerase), Sulfolobus
acidocaldarius (Sac polymerase), Thermoplasma acidophilum (Tac polymerase),
Thermus flavus (Tfl/Tub polymerase), Thermus ruber (Tru polymerase), Thermus
brockianus (DYNAZYME™ polymerase), Thermotoga neapolitana (Tne polymerase;
see WO 96/10640), Thermotoga maritima (Tma polymerase; see U.S. Patent No.
5,374,553) and other species of the Thermotoga genus (Tsp polymerase) and
Methanobacterium thermoautotrophicum (Mth polymerase). The particular
polymerase discussed herein, by way of illustration and not limitation, is the enzyme
derived from T.th.

Polymerase III-type enzymes covered by the invention include those that may be
prepared by purification from cellular material, as described in detail in Example 9
herein, as well as enzyme assemblies or complexes that comprise the combination of
individually prepared enzyme subunits or components. Accordingly, the entire
enzyme may be prepared by purification from cellular material, or may be constructed
by the preparation of the individual components and their assembly into the functional
enzyme. A representative and non-limitative protocol for the preparation of an
enzyme by this latter route is set forth in U.S. Patent No. 5,583,026, issued December
10, 1996, to one of the inventors herein, and the disclosure thereof is incorporated
herein in its entirety for such purpose.

Likewise, individual subunits may be modified, e.g. as by incorporation therein of
single residue substitutions to create active sites therein, for the purpose of imparting
new or enhanced properties to enzymes containing the modified subunits. See, for
example, Tabor, S. et al. (1995) Proc. Natl. Acad. Sci. USA 92(14):6339-6343, the
disclosure of which is also incorporated herein in its entirety. Likewise, individual
subunits prepared in accordance with the invention may be used individually and, for
example, may be substituted for their counterparts in other enzymes, to improve or
particularize the properties of the resultant modified enzyme. Such modifications are
within the skill of the art and are considered to be included within the scope of the
present invention.

Accordingly, the invention includes the various subunits that may comprise the
enzymes, and accordingly extends to the genes and corresponding proteins that may
be encoded thereby, such as the α, β, γ, ε, τ, δ and δ' subunits, respectively. More
particularly, the α subunit corresponds to dnaE, the β subunit corresponds to dnaN,
the ε subunit corresponds to dnaQ, and the γ and τ subunits correspond to dnaX.



Accordingly, the Polymerase III-type enzyme of the present invention comprises at
least one gene encoding a subunit thereof, which gene is selected from the group
consisting of dnaX, dnaQ, dnaE and dnaN, and combinations thereof. More
particularly, the invention extends to the nucleic acid molecule encoding the γ and τ
subunits, and includes the dnaX gene which has a nucleotide sequence as set forth in
SEQ ID NO:3, as well as conserved variants, active fragments and analogs thereof.
Likewise, the nucleotide sequences encoding the α subunit (dnaE gene), the ε
subunit (dnaQ gene) and the β subunit (dnaN gene) each comprise the nucleotide
sequences as set forth, respectively, in SEQ ID NOS:94, 86 and 106, as well as
conserved variants, active fragments and analogs thereof.



A particular Polymerase III-type enzyme in accordance with the invention may
include at least one of the following subunits:

A. a γ subunit having an amino acid sequence selected from the formula
set forth in SEQ ID NOS:4 and 5;

B. a τ subunit having an amino acid sequence corresponding to the
formula set forth in SEQ ID NO:2;

C. an ε subunit having an amino acid sequence corresponding to the
formula set forth in SEQ ID NO:95;

D. an α subunit including an amino acid sequence corresponding to the
formula set forth in SEQ ID NO:87;

E. a β subunit having an amino acid sequence corresponding to the
formula set forth in SEQ ID NO:107; and

F. combinations of the above.

The invention also includes and extends to the use and application of the enzyme 
and/or one or more of its components for DNA molecule amplification and 
sequencing by the methods set forth hereinabove, and in greater detail later on herein. 



One of the subunits of the invention is the γ/τ subunit encoded by a dnaX gene, which
frameshifts as much as -2 with high efficiency, and that, upon frameshifting, leads to
the addition of more than one extra amino acid residue to the C-terminus (to form the
γ subunit). Further, the invention likewise extends to a dnaX gene derived from a
thermophile such as T.th., that possesses the frameshift defined herein and that codes
for expression of the γ and τ subunits of DNA Polymerase III.

The present invention provides methods for amplifying or sequencing a nucleic acid
molecule comprising contacting the nucleic acid molecule with a composition
comprising a DNA polymerase III enzyme (DNA pol III) complex, preferably a DNA
pol III complex that is substantially reduced in 3'-5' exonuclease activity. DNA pol III
complexes used in the methods of the present invention are thermostable.

The invention also provides DNA molecules amplified by the present methods,
methods of preparing a recombinant vector comprising inserting a DNA molecule
amplified by the present methods into a vector, which is preferably an expression
vector, and recombinant vectors prepared by these methods.

The invention also provides methods of preparing a recombinant host cell comprising
inserting a DNA molecule amplified by the present methods into a host cell, which is
preferably a bacterial cell, most preferably an Escherichia coli cell; a yeast cell; or an
animal cell, most preferably an insect cell, a nematode cell or a mammalian cell. The
invention also provides recombinant host cells prepared by these methods.

In additional preferred embodiments, the present invention provides kits for
amplifying or sequencing a nucleic acid molecule. DNA amplification kits according
to the invention comprise a carrier means having in close confinement therein two or
more container means, wherein a first container means contains a DNA polymerase III
enzyme complex and a second container means contains a deoxynucleoside
triphosphate. DNA sequencing kits according to the present invention comprise a
first container means containing a multi-protein Pol III-type enzyme complex and a
second container means containing a dideoxynucleoside triphosphate. The DNA pol III
contained in the container means of such kits is preferably substantially reduced in
5'-3' exonuclease activity, may be thermostable, and may be isolated from the
thermophilic cellular sources described above. Most preferably, the DNA pol III
contained in the container means of such kits is a DNA polymerase III-type complex
of a thermophile which lacks the ε subunit.

DNA pol III-type enzyme complexes for use in the present invention may be isolated
from any organism that produces the DNA pol III-type enzyme complexes naturally
or recombinantly. Such enzyme complexes may be thermostable, isolated from a
variety of thermophilic organisms.

The thermostable DNA polymerase III-type enzymes or complexes that are an
important aspect of this invention may be isolated from a variety of thermophilic
bacteria that are available commercially (for example, from American Type Culture
Collection, Rockville, Maryland). Suitable for use as sources of thermostable
enzymes are the thermophilic bacteria Thermus aquaticus, Thermus thermophilus,
Thermococcus litoralis, Pyrococcus furiosus, Pyrococcus woesei and other species of
the Pyrococcus genus, Bacillus stearothermophilus, Sulfolobus acidocaldarius,
Thermoplasma acidophilum, Thermus flavus, Thermus ruber, Thermus brockianus,
Thermotoga neapolitana, Thermotoga maritima and other species of the Thermotoga
genus, and Methanobacterium thermoautotrophicum, and mutants of each of these
species. It will be understood by one of ordinary skill in the art, however, that any
thermophilic microorganism might be used as a source of thermostable DNA pol
III-type enzymes and polypeptides for use in the methods of the present invention.
Bacterial cells may be grown according to standard microbiological techniques, using
culture media and incubation conditions suitable for growing active cultures of the
particular thermophilic species that are well-known to one of ordinary skill in the art
(see, e.g., Brock, T.D., and Freeze, H., J. Bacteriol. 98(1):289-297 (1969); Oshima,
T., and Imahori, K., Int. J. Syst. Bacteriol. 24(1):102-112 (1974)). Thermostable DNA
pol III complexes may then be isolated from such thermophilic cellular sources as
described for thermolabile complexes above.

As stated above and in accordance with the present invention, nucleic acid molecules
may be amplified according to any of the literature-described manual or automated
amplification methods. Such methods include, but are not limited to, PCR (U.S.
Patent Nos. 4,683,195 and 4,683,202), Strand Displacement Amplification (SDA;
U.S. Patent No. 5,455,166; EP 0 684 315), and Nucleic Acid Sequence-Based
Amplification (NASBA; U.S. Patent No. 5,409,818; EP 0 329 822). Most preferably,
nucleic acid molecules are amplified by the methods of the present invention using
PCR-based amplification techniques.

In the initial steps of each of these amplification methods, the nucleic acid molecule to
be amplified is contacted with a composition comprising a DNA polymerase
belonging to the evolutionary "family A" class (e.g., Taq DNA pol I or E. coli pol I) or
the "family B" class (e.g., Vent and Pfu DNA polymerases; see Ito, J., and
Braithwaite, D., Nucl. Acids Res. 19(15):4045-4057 (1991)). All of these DNA
polymerases are present as single subunits and are primarily involved in DNA repair.
In contrast, the DNA pol III-type enzymes are multisubunit complexes that mainly
function in the replication of the chromosome, and the subunit containing the DNA
polymerase activity is in the "family C" class.

Thus, in amplifying a nucleic acid molecule according to the methods of the present
invention, the nucleic acid molecule is contacted with a composition comprising a
thermostable DNA pol III-type enzyme complex. The DNA pol III-type complexes
used in the present methods are preferably substantially reduced in 3'-5' exonuclease
activity (i.e., they are "exo-").

Once the nucleic acid molecule to be amplified is contacted with the DNA pol III-type
complex, the amplification reaction may proceed according to standard protocols for
each of the above-described techniques. Since most of these techniques comprise a
high-temperature denaturation step, if a thermolabile DNA pol III-type enzyme
complex (such as E. coli DNA pol III exo-) is used in nucleic acid amplification by
any of these techniques, the enzyme would need to be added at the start of each
amplification cycle, since it would be heat-inactivated at the denaturation step.
However, a thermostable DNA pol III-type complex used in these methods need only
be added once at the start of the amplification (as for Taq DNA polymerase in
traditional PCR amplifications), as its activity will be unaffected by the high
temperature of the denaturation step. It should be noted, however, that because DNA
pol III-type enzymes have a much more rapid rate of nucleotide incorporation than the
polymerases commonly used in these amplification techniques, the cycle times may
need to be adjusted to shorter intervals than would be standard.

In an alternative preferred embodiment, the invention provides methods of extending
primers for several kilobases, a reaction that is central to amplifying large nucleic acid
molecules, by a technique commonly referred to as "long PCR" (Barnes, W.M., Proc.
Natl. Acad. Sci. USA 91:2216-2220 (1994); Cheng, S. et al., Proc. Natl. Acad. Sci.
USA 91:5659-5699 (1994)).



In such a method the target primed DNA can contain a single strand stretch of DNA to
be copied into the double strand form of several or tens of kilobases. The reaction is
performed in a suitable buffer, preferably Tris, at a pH of between 5.5 and 9.5, preferably
7.5. The reaction also contains MgCl2 in the range of 1 mM to 10 mM, preferably 8
mM, and may contain a suitable salt such as NaCl, KCl or sodium or potassium
acetate. The reaction also contains ATP in the range of 20 µM to 1 mM, preferably
0.5 mM, that is needed for the clamp loader to assemble the clamp onto the primed
template, and a sufficient concentration of deoxynucleoside triphosphates in the range
of 50 µM to 0.5 mM, preferably 60 µM, for chain extension. The reaction contains a
sliding clamp, such as the β subunit, in the range of 20 ng to 200 ng, preferably 100
ng, for action as a clamp to stimulate the DNA polymerase. The chain extension
reaction contains a DNA polymerase and a clamp loader, which could be added either
separately or as a single Pol III*-like particle, preferably as a Pol III*-like particle that
contains the DNA polymerase and clamp loading activities. The Pol III-type enzyme
is added preferably at a concentration of about 0.0002-200 units per milliliter, about
0.002-100 units per milliliter, about 0.2-50 units per milliliter, and most preferably
about 2-50 units per milliliter. The reaction is incubated at elevated temperature,
preferably 60 °C or more, and could include other proteins to enhance activity such as
a single strand DNA binding protein.
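
For convenience, the ranges recited above for the clamp-loader-dependent extension reaction can be collected into a simple lookup and checked programmatically. The Python sketch below is illustrative only: the dictionary keys and the function name are hypothetical, the enzyme value of 2 units per milliliter is merely one point within the most preferred 2-50 range, and treating 60 °C as a hard minimum is an assumption, since the text states only that 60 °C or more is preferred.

```python
from typing import Dict, List

# Broadest ranges recited for the clamp-loader-dependent reaction (low, high).
RANGES = {
    "ph": (5.5, 9.5),
    "mgcl2_mM": (1.0, 10.0),
    "atp_mM": (0.02, 1.0),        # 20 uM to 1 mM
    "dntp_mM": (0.05, 0.5),       # 50 uM to 0.5 mM
    "clamp_ng": (20.0, 200.0),
    "enzyme_units_per_ml": (0.0002, 200.0),
}

# Preferred values recited in the text; the enzyme value is an assumed example.
PREFERRED = {
    "ph": 7.5,
    "mgcl2_mM": 8.0,
    "atp_mM": 0.5,
    "dntp_mM": 0.06,              # 60 uM
    "clamp_ng": 100.0,
    "enzyme_units_per_ml": 2.0,
    "temperature_C": 60.0,
}

MIN_TEMPERATURE_C = 60.0          # assumption: "preferably 60 C or more"


def out_of_range(conditions: Dict[str, float]) -> List[str]:
    """Return the names of parameters falling outside the recited ranges."""
    problems = [
        name for name, (low, high) in RANGES.items()
        if not low <= conditions[name] <= high
    ]
    if conditions["temperature_C"] < MIN_TEMPERATURE_C:
        problems.append("temperature_C")
    return problems
```

Calling `out_of_range(PREFERRED)` returns an empty list, while a reaction set up at 37 °C would be flagged on `temperature_C`.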



In another preferred embodiment, the invention provides methods of extending
primers on linear templates in the absence of the clamp loader. In this reaction, the
primers are annealed to the linear DNA, preferably at the ends such as in standard
PCR applications. The reaction is performed in a suitable buffer, preferably Tris, at a
pH of between 5.5 and 9.5, preferably 7.5. The reaction also contains MgCl2 in the range
of 1 mM to 10 mM, preferably 8 mM, and may contain a suitable salt such as NaCl,
KCl or sodium or potassium acetate. The reaction also contains a sufficient
concentration of deoxynucleoside triphosphates in the range of 50 µM to 0.5 mM,
preferably 60 µM, for chain extension. The reaction contains a sliding clamp, such as
the β subunit, in the range of 20 ng to 20 µg, preferably 7 µg, for ability to slide on the
end of the DNA and associate with the polymerase for action as a clamp to stimulate
the DNA polymerase. The chain extension reaction also contains a Pol III-type
polymerase subunit such as α, core, or a Pol III*-like particle. The Pol III-type
enzyme is added preferably at a concentration of about 0.0002-200 units per
milliliter, about 0.002-100 units per milliliter, about 0.2-50 units per milliliter, and
most preferably about 2-50 units per milliliter. The reaction is incubated at elevated
temperature, preferably 60 °C or more, and could include other proteins to enhance
activity such as a single strand DNA binding protein.

The methods of the present invention thus will provide high-fidelity amplified copies 
of a nucleic acid molecule in a more rapid fashion than traditional amplification 
methods using the repair-type enzymes. 

These amplified nucleic acid molecules may then be manipulated according to 
standard recombinant DNA techniques. For example, a nucleic acid molecule 
amplified according to the present methods may be inserted into a vector, which is 
preferably an expression vector, to produce a recombinant vector comprising the 
amplified nucleic acid molecule. This vector may then be inserted into a host cell, 
where it may, for example, direct the host cell to produce a recombinant polypeptide 
encoded by the amplified nucleic acid molecule. Methods for inserting nucleic acid 
molecules into vectors, and inserting these vectors into host cells, are well-known to 
one of ordinary skill in the art (see, e.g., Maniatis, T., et al., Molecular Cloning, A
Laboratory Manual, Boca Raton, Florida: CRC Press (1992)).

Alternatively, the amplified nucleic acid molecules may be directly inserted into a
host cell, where they may be incorporated into the host cell genome or may exist as
extrachromosomal nucleic acid molecules, thereby producing a recombinant host cell.
Methods for introduction of a nucleic acid molecule into a host cell, including calcium
phosphate transfection, DEAE-dextran mediated transfection, cationic lipid-mediated
transfection, electroporation, transduction, infection or other methods, are described in
many standard laboratory manuals (see, e.g., Davis et al., Basic Methods In Molecular
Biology (1986)).

For each of the above techniques wherein an amplified nucleic acid molecule is
introduced into a host cell via a vector or via direct introduction, preferred host cells
include but are not limited to a bacterial cell, a yeast cell, or an animal cell. Bacterial
host cells preferred in the present invention are E. coli, Bacillus spp., Streptomyces
spp., Erwinia spp., Klebsiella spp. and Salmonella typhimurium. Preferred as a host
cell is E. coli, and particularly preferred are E. coli strains DH10B and Stbl2, which
are available commercially (Life Technologies, Inc., Gaithersburg, Maryland).
Preferred animal host cells are insect cells, nematode cells and mammalian cells.
Insect host cells preferred in the present invention are Drosophila spp. cells,
Spodoptera Sf9 and Sf21 cells, and Trichoplusia High-Five cells, each of which is
available commercially (e.g., from Invitrogen; San Diego, California). Preferred
nematode host cells are those derived from C. elegans, and preferred mammalian host
cells are those derived from rodents, particularly rats, mice or hamsters, and primates,
particularly monkeys and humans. Particularly preferred as mammalian host cells are
CHO cells, COS cells and VERO cells.

By the present invention, nucleic acid molecules may be sequenced according to any
of the literature-described manual or automated sequencing methods. Such methods
include, but are not limited to, dideoxy sequencing methods ("Sanger sequencing";
Sanger, F., and Coulson, A.R., J. Mol. Biol. 94:444-448 (1975); Sanger, F., et al.,
Proc. Natl. Acad. Sci. USA 74:5463-5467 (1977); U.S. Patent Nos. 4,962,022 and
5,498,523), as well as more complex PCR-based nucleic acid fingerprinting
techniques such as Random Amplified Polymorphic DNA (RAPD) analysis
(Williams, J.G.K. et al., Nucl. Acids Res. 18(22):6531-6535, 1990), Arbitrarily
Primed PCR (AP-PCR; Welsh, J., and McClelland, M., Nucl. Acids Res. 18(24):7213-
7218, 1990), DNA Amplification Fingerprinting (DAF; Caetano-Anolles et al.,
Bio/Technology 9:553-557, 1991), microsatellite PCR or Directed Amplification of
Minisatellite-region DNA (DAMD; Heath, D.D., et al., Nucl. Acids Res. 21(24):5782-
5785, 1993), and Amplification Fragment Length Polymorphism (AFLP) analysis (EP
0 534 858; Vos, P., et al., Nucl. Acids Res. 23(21):4407-4414, 1995; Lin, J.J., and
Kuo, J., FOCUS 17(2):66-70, 1995).

As described above for amplification methods, the nucleic acid molecule to be
sequenced by these methods is typically contacted with a composition comprising a
type A or type B DNA polymerase. By contrast, in sequencing a nucleic acid
molecule according to the methods of the present invention, the nucleic acid molecule
is contacted with a composition comprising a thermostable DNA pol III-type enzyme
complex instead of necessarily using a DNA polymerase of the family A or B classes.
As for amplification methods, the DNA pol III-type complexes used in the nucleic
acid sequencing methods of the present invention are preferably substantially reduced
in 5'-3' exonuclease activity; most preferable for use in the present methods is a DNA
polymerase III-type complex which lacks the ε subunit. DNA pol III-type complexes
used for nucleic acid sequencing according to the present methods are used at the
same preferred concentration ranges described above for long chain extension of
primers.

Once the nucleic acid molecule to be sequenced is contacted with the DNA pol III 
complex, the sequencing reactions may proceed according to the protocols disclosed 
in the above-referenced techniques. 

As discussed above, the invention extends to kits for use in nucleic acid amplification
or sequencing utilizing DNA polymerase III-type enzymes according to the present
methods. A DNA amplification kit according to the present invention may comprise a
carrier means having in close confinement therein two or more container means, such
as vials, tubes, bottles and the like. A first such container means
may contain a DNA polymerase III-type enzyme complex, and a second such
container means may contain a deoxynucleoside triphosphate. The amplification kit
encompassed by this aspect of the present invention may further comprise additional
reagents and compounds necessary for carrying out standard nucleic acid amplification
protocols (see U.S. Patent Nos. 4,683,195 and 4,683,202, which are directed to
methods of DNA amplification by PCR).

Similarly, a DNA sequencing kit according to the present invention comprises a
carrier means having in close confinement therein two or more container means, such
as vials, tubes, bottles and the like. A first such container means may contain a DNA
polymerase III-type enzyme complex, and a second such container means may contain
a dideoxynucleoside triphosphate. The sequencing kit may further comprise
additional reagents and compounds necessary for carrying out standard nucleic acid
sequencing protocols, such as pyrophosphatase, agarose or polyacrylamide media for
formulating sequencing gels, and other components necessary for detection of
sequenced nucleic acids (see U.S. Patent Nos. 4,962,020 and 5,498,523, which are
directed to methods of DNA sequencing).

The DNA polymerase III-type complex contained in the first container means of the
amplification and sequencing kits provided by the invention is preferably a
thermostable DNA polymerase III-type enzyme complex and more preferably a DNA
polymerase III-type enzyme complex that is substantially reduced in 3'-5' exonuclease
activity. Naturally, the foregoing methods and kits are presented as illustrative and
not restrictive of the use and application of the enzymes of the invention for DNA
molecule amplification and sequencing. Likewise, the applications of specific
embodiments of the enzymes, including conserved variants and active fragments
thereof, are considered to be disclosed and included within the scope of the invention.

As discussed earlier, individual subunits could be modified to customize enzyme
construction and corresponding use and activity. For example, the region of α that
interacts with β could be subcloned onto another DNA polymerase, thereby causing β
to enhance the activity of the recombinant polymerase. Alternatively, the β clamp
could be modified to function with another protein or enzyme, thereby enhancing its
activity or acting to localize its action to a particular targeted DNA. Finally, the
polymerase active site could be modified to enhance its action, for example by changing
Tyrosine to enable more equal site stoppage with the four ddNTPs (Tabor et al. 1995).
This represents a particular non-limiting illustration of the scope and practice of the
present invention with reference to the utility of individual subunits hereof.

Accordingly and as stated above, the present invention also relates to a recombinant
DNA molecule or cloned gene, or a degenerate variant thereof, which encodes any
one or all of the subunits of the DNA Polymerase III-type enzymes of the present
invention, or active fragments thereof. In the instance of the τ subunit, a predicted
molecular weight of about 58 kD and an amino acid sequence set forth in SEQ ID
NOS:4 or 5 is comprehended; preferably a nucleic acid molecule, in particular a
recombinant DNA molecule or cloned gene, encoding the 58 kD subunit of the
Polymerase III of the invention, that has a nucleotide sequence or is complementary to
a DNA sequence shown in FIGURES 4A and 4B (SEQ ID NO:1), and the coding
region for dnaX set forth in FIGURE 4C (SEQ ID NO:3). The γ subunit is smaller,
and is approximately 50 kD, depending upon the extent of the frameshift that occurs.
More particularly, and as set forth in FIGURE 4E (SEQ ID NO:4), the γ subunit
defined by a -1 frameshift possesses a molecular weight of 50.8 kD, while the γ
subunit defined by a -2 frameshift, set forth in FIGURE 4F (SEQ ID NO:5), possesses
a molecular weight of 49.8 kD.

As discussed above, the invention also extends to the genes including dnaX, dnaQ,
dnaE and dnaN, that have been isolated and purified from Thermus thermophilus, to
corresponding vectors for the genes, and particularly, to the vectors pETdnaX and
pETdnaN, and to host cells including such vectors. In this connection, probes have
been prepared which hybridize to the DNA polymerase III-type enzymes of the
present invention, and which are selected from the group consisting of the
oligonucleotide defined in SEQ ID NO:6; the oligonucleotide defined in SEQ ID
NO:8; the oligonucleotide defined in SEQ ID NO:10; the oligonucleotide defined in
SEQ ID NO:11; the oligonucleotide defined in SEQ ID NO:12; the oligonucleotide
defined in SEQ ID NO:13; the oligonucleotide defined in SEQ ID NO:14; the
oligonucleotide defined in SEQ ID NO:15, and the oligonucleotide defined in SEQ ID
NO:16.

The methods of the invention include a method for producing a recombinant
thermostable DNA polymerase III-type enzyme from a thermophilic bacterium such
as Thermus thermophilus which comprises culturing a host cell transformed with a
vector of the invention under conditions suitable for the expression of the present
DNA polymerase III. Another method includes a method for isolating a target DNA
fragment consisting essentially of a DNA coding for a thermostable DNA polymerase
III-type enzyme from a thermophilic bacterium comprising the steps of:

(a) forming a genomic library from the bacterium;

(b) transforming or transfecting an appropriate host cell with the library of step (a);

(c) contacting DNA from the transformed or transfected host cell with a DNA
probe which hybridizes to a DNA fragment selected from the group consisting of the
DNA fragments defined in SEQ ID NO:6 and the DNA fragments defined in SEQ ID
NO:8 or the oligonucleotides set forth above; wherein hybridization is conducted
under the following conditions:

i) hybridization: 1% crystalline BSA (fraction V) (Sigma), 1 mM EDTA,
0.5 M NaHPO4 (pH 7.2), 7% SDS at 65°C for 12 hours; and

ii) wash: 5 x 20 minutes with wash buffer consisting of 0.5% BSA
(fraction V), 1 mM Na2EDTA, 40 mM NaHPO4 (pH 7.2), and 5% SDS;

(d) assaying the transformed or transfected cell of step (c) which hybridizes to
the DNA probe for DNA polymerase III-type activity; and

(e) isolating a target DNA fragment which codes for the thermostable DNA
polymerase III-type enzyme.

Also, antibodies including both polyclonal and monoclonal antibodies, and the DNA
Polymerase III-like enzyme complex and/or their γ and τ subunits or α subunit may
be used in the preparation of the enzymes of the present invention as well as other
enzymes of similar thermophilic origin. For example, the DNA Polymerase III-type
complex or its subunits may be used to produce both polyclonal and monoclonal
antibodies to themselves in a variety of cellular media, by known techniques such as
the hybridoma technique utilizing, for example, fused mouse spleen lymphocytes and
myeloma cells.

The general methodology for making monoclonal antibodies by hybridomas is well
known. Immortal, antibody-producing cell lines can also be created by techniques
other than fusion, such as direct transformation of B lymphocytes with oncogenic
DNA, or transfection with Epstein-Barr virus. See, e.g., Schreier et al.,
"Hybridoma Techniques" (1980); Hammerling et al., "Monoclonal Antibodies And
T-cell Hybridomas" (1981); Kennett et al., "Monoclonal Antibodies" (1980); see also
U.S. Patent Nos. 4,341,761; 4,399,121; 4,427,783; 4,444,887; 4,451,570; 4,466,917;
4,472,500; 4,491,632; 4,493,890.

Methods for producing polyclonal anti-polypeptide antibodies are well-known in the
art. See U.S. Patent No. 4,493,795 to Nestor et al. A monoclonal antibody, typically
containing Fab and/or F(ab')2 portions of useful antibody molecules, can be prepared
using the hybridoma technology described in Antibodies - A Laboratory Manual,
Harlow and Lane, eds., Cold Spring Harbor Laboratory, New York (1988), which is
incorporated herein by reference. Briefly, to form the hybridoma from which the
monoclonal antibody composition is produced, a myeloma or other self-perpetuating
cell line is fused with lymphocytes obtained from the spleen of a mammal
hyperimmunized with an elastin-binding portion thereof.

A monoclonal antibody useful in practicing the present invention can be produced by
initiating a monoclonal hybridoma culture comprising a nutrient medium containing a
hybridoma that secretes antibody molecules of the appropriate antigen specificity.
The culture is maintained under conditions and for a time period sufficient for the
hybridoma to secrete the antibody molecules into the medium. The antibody-containing
medium is then collected. The antibody molecules can then be further
isolated by well-known techniques.

Media useful for the preparation of these compositions are both well-known in the art
and commercially available and include synthetic culture media, inbred mice and the
like. An exemplary synthetic medium is Dulbecco's minimal essential medium
(DMEM; Dulbecco et al., Virol. 8:396 (1959)) supplemented with 4.5 gm/l glucose,
20 mM glutamine, and 20% fetal calf serum. An exemplary inbred mouse strain is the
Balb/c.

Another feature of this invention is the expression of the DNA sequences disclosed
herein. As is well known in the art, DNA sequences may be expressed by operatively
linking them to an expression control sequence in an appropriate expression vector
and employing that expression vector to transform an appropriate unicellular host.

Such operative linking of a DNA sequence of this invention to an expression control
sequence, of course, includes, if not already part of the DNA sequence, the provision
of an initiation codon, ATG, in the correct reading frame upstream of the DNA
sequence.

A wide variety of host/expression vector combinations may be employed in
expressing the DNA sequences of this invention. Useful expression vectors, for
example, may consist of segments of chromosomal, non-chromosomal and synthetic
DNA sequences. Suitable vectors include derivatives of SV40 and known bacterial
plasmids, e.g., E. coli plasmids col E1, pCR1, pBR322, pMB9 and their derivatives,
plasmids such as RP4; phage DNAs, e.g., the numerous derivatives of phage λ, e.g.,
NM989, and other phage DNA, e.g., M13 and filamentous single stranded phage
DNA; yeast plasmids such as the 2µ plasmid or derivatives thereof; vectors useful in
eukaryotic cells, such as vectors useful in insect or mammalian cells; vectors derived
from combinations of plasmids and phage DNAs, such as plasmids that have been
modified to employ phage DNA or other expression control sequences; and the like.

Any of a wide variety of expression control sequences (sequences that control the
expression of a DNA sequence operatively linked to them) may be used in these vectors
to express the DNA sequences of this invention. Such useful expression control
sequences include, for example, the early or late promoters of SV40, CMV, vaccinia,
polyoma or adenovirus, the lac system, the trp system, the TAC system, the TRC
system, the LTR system, the major operator and promoter regions of phage λ, the
control regions of fd coat protein, the promoter for 3-phosphoglycerate kinase or other
glycolytic enzymes, the promoters of acid phosphatase (e.g., Pho5), the promoters of
the yeast α-mating factors, and other sequences known to control the expression of
genes of prokaryotic or eukaryotic cells or their viruses, and various combinations
thereof.

A wide variety of unicellular host cells are also useful in expressing the DNA
sequences of this invention. These hosts may include well known eukaryotic and
prokaryotic hosts, such as strains of E. coli, Pseudomonas, Bacillus, Streptomyces,
fungi such as yeasts, and animal cells, such as CHO, R1.1, B-W and L-M cells, African
Green Monkey kidney cells (e.g., COS 1, COS 7, BSC1, BSC40, and BMT10), insect
cells (e.g., Sf9), and human cells and plant cells in tissue culture.

It will be understood that not all vectors, expression control sequences and hosts will
function equally well to express the DNA sequences of this invention. Neither will all
hosts function equally well with the same expression system. However, one skilled in
the art will be able to select the proper vectors, expression control sequences, and
hosts without undue experimentation to accomplish the desired expression without
departing from the scope of this invention. For example, in selecting a vector, the
host must be considered because the vector must function in it. The vector's copy
number, the ability to control that copy number, and the expression of any other
proteins encoded by the vector, such as antibiotic markers, will also be considered.

In selecting an expression control sequence, a variety of factors will normally be
considered. These include, for example, the relative strength of the system, its
controllability, and its compatibility with the particular DNA sequence or gene to be
expressed, particularly with regard to potential secondary structures. Suitable
unicellular hosts will be selected by consideration of, e.g., their compatibility with the
chosen vector, their secretion characteristics, their ability to fold proteins correctly,
and their fermentation requirements, as well as the toxicity to the host of the product
encoded by the DNA sequences to be expressed, and the ease of purification of the
expression products.

Considering these and other factors, a person skilled in the art will be able to construct
a variety of vector/expression control sequence/host combinations that will express
the DNA sequences of this invention on fermentation or in large scale animal culture.

It is further intended that analogs may be prepared from nucleotide sequences of the
protein complex/subunit derived within the scope of the present invention. Analogs,
such as fragments, may be produced, for example, by pepsin digestion of bacterial
material. Other analogs, such as muteins, can be produced by standard site-directed
mutagenesis of dnaX, dnaE, dnaQ or dnaN coding sequences. Especially useful may
be a mutation in dnaE that provides the polymerase with the ability to incorporate all
four ddNTPs with equal efficiency, thereby producing an even banding pattern in
sequencing gels, as discussed above and with reference to Tabor et al. 1995, supra.

As mentioned above, a DNA sequence corresponding to dnaX, dnaQ, dnaE or dnaN,
or encoding the subunits of the DNA Polymerase III of the invention can be prepared
synthetically rather than cloned. The DNA sequence can be designed with the
appropriate codons for the amino acid sequence of the subunit(s) of interest. In
general, one will select preferred codons for the intended host if the sequence will be
used for expression. The complete sequence is assembled from overlapping
oligonucleotides prepared by standard methods and assembled into a complete coding
sequence. See, e.g., Edge, Nature, 292:756 (1981); Nambiar et al., Science, 223:1299
(1984); Jay et al., J. Biol. Chem., 259:6311 (1984).

Synthetic DNA sequences allow convenient construction of genes which will express
DNA Polymerase III analogs or "muteins". Alternatively, DNA encoding muteins can
be made by site-directed mutagenesis of native dnaX, dnaQ, dnaE or dnaN genes or
their corresponding cDNAs, and muteins can be made directly using conventional
polypeptide synthesis.

A general method for site-specific incorporation of unnatural amino acids into
proteins is described in Christopher J. Noren, Spencer J. Anthony-Cahill, Michael C.
Griffith, Peter G. Schultz, Science, 244:182-188 (April 1989). This method may be
used to create analogs with unnatural amino acids.

GENERAL DESCRIPTION

As discussed above, the present invention has as one of its characterizing features that
a Polymerase III-type enzyme, as defined hereinabove, has been discovered in a
thermophile and has the structure and function of a chromosomal replicase. This
structure and function confer significant benefit when the enzyme is employed in
procedures such as PCR where speed and accuracy of DNA reconstruction are crucial.

Chromosomal replicases are composed of several subunits in all organisms (Kornberg
and Baker, 1992). In keeping with the need to replicate long chromosomes, replicases
are rapid and highly processive multiprotein machines. All cellular replicases
examined to date derive their processivity from one subunit that is shaped like a ring
and completely encircles DNA (Kuriyan and O'Donnell, 1993; Kelman and
O'Donnell, 1994). This "sliding clamp" subunit acts as a mobile tether for the
polymerase machine (Stukenberg et al., 1991). The sliding clamp does not assemble
onto the DNA by itself, but requires a complex of several proteins, called a "clamp
loader", which couples ATP hydrolysis to the assembly of sliding clamps onto DNA
(O'Donnell et al., 1992). Hence, Pol III-type cellular replicases are comprised of
three components: a clamp, a clamp loader, and the DNA polymerase.

An overall goal is to identify and isolate all of the genes encoding the replicase
subunits from a thermophile for expression and purification in large quantity.
Following this, the replication apparatus can be reassembled from individual subunit
components for use in kits, PCR, sequencing and diagnostic applications (Onrust et
al., 1995).

As a beginning to identify and characterize the replicase of a thermophile, we started
by looking for a homologue to the prokaryotic dnaX gene, which encodes subunits (γ
and τ) of the replicase. The dnaX gene has another homologue, holB, which encodes
yet another subunit (δ') of the replicase. The amino acid sequences of the δ' (encoded by
holB) and τ/γ subunits (encoded by dnaX) are particularly highly conserved in
evolution from prokaryotes to eukaryotes (Chen et al., 1992; O'Donnell et al., 1993;
Onrust et al., 1993; Carter et al., 1993; Cullman et al., 1995).

The organism chosen for study and exposition herein is the exemplary extreme
thermophile, Thermus thermophilus (T.th.). It is understood that other members of
the class such as the eubacterium Thermotoga are expected to be analogous in both
structure and function. Thus, the investigation of T.th. proceeded and initially, a T.th.
homologue of dnaX was identified. The gene encodes a full length protein of 529
amino acids. The amino terminal third of the sequence shares over 50% homology to
dnaX genes as divergent as E. coli (gram negative) and B. subtilis (gram positive).
The T.th. dnaX gene contains a DNA sequence that provides a translational frameshift
signal for production of two proteins from the same gene. Such frameshifting has
been documented only in the case of E. coli (Tsuchihashi and Kornberg, 1990; Flower
and McHenry, 1990; Blinkowa and Walker, 1990). No frameshifting has been
documented to occur in the dnaX homologues (RFC subunit genes) of yeast and
humans (Eukaryotic kingdom).

The presence of a dnaX gene that produces two subunits implies that T.th has a clamp
loader (γ) and is organized by τ into a three component Pol III-type replicase. The
three components of its replicase may be organized into a holoenzyme particle like the
replicative DNA polymerase of Escherichia coli, DNA polymerase III holoenzyme.
The E. coli DNA polymerase III holoenzyme contains 10 different subunits, some in
copies of two or more for a total composition of 18 polypeptide chains (Baker and
Kornberg, 1992; Onrust et al., 1995). The holoenzyme is composed of three major
activities: the 3-subunit DNA polymerase core (αεθ), the β subunit DNA sliding
clamp, and the 5-subunit γ complex clamp loader (γδδ'χψ). This 3 component
strategy generalizes to eukaryotes, which utilize a clamp (PCNA) and a 5-subunit
clamp loader (RFC) which provide processivity to DNA polymerase δ (reviewed in
Kelman and O'Donnell, 1994).

In E. coli, the three components are organized into one holoenzyme particle by the τ
subunit, that acts as a "glue" protein (Onrust and O'Donnell, 1995). One dimer of τ
holds together two core polymerases into one particle which are utilized for the
coordinated and simultaneous replication of both strands of duplex DNA (McHenry,
1982; Maki et al., 1988; Yuzhakov et al., 1996). The "glue" protein τ subunit also
binds one clamp loader (called γ complex) thereby acting as a scaffold for a large
superstructure assembly called DNA polymerase III holoenzyme. The gene encoding
τ, called dnaX, also encodes the γ subunit of DNA polymerase III. The β subunit
then associates with Pol III to form the DNA polymerase III holoenzyme. The γ
subunit is approximately 2/3 the length of τ. γ shares the N-terminus of τ, but is
truncated by a translational frameshifting mechanism that, after the shift, encounters a
stop codon within two amino acids (Tsuchihashi and Kornberg, 1990; Flower and
McHenry, 1990; Blinkowa and Walker, 1990). Hence, γ is the N-terminal 453 amino
acids of τ, but contains one unique residue at the C-terminus (the penultimate codon
encodes a Lys residue which is the same sequence as if the frameshift did not take
place). This frameshift is highly efficient and occurs approximately 50% of the time.
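
The truncation mechanism described above can be sketched in a few lines of Python. The 24-nucleotide sequence, the slip position, and the function names below are hypothetical and chosen only for illustration; they are not the E. coli or T.th. dnaX sequence. The sketch translates a toy open reading frame in the zero frame and again with a -1 slip after a Lys codon, so that the shifted frame contributes one unique residue before reaching a stop codon.

```python
# Standard genetic code built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(seq, start=0):
    """Translate seq from index start until a stop codon or the end."""
    protein = []
    for i in range(start, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)

def translate_with_slip(seq, codons_before_slip, shift=-1):
    """Read codons_before_slip codons in frame, then continue in the shifted frame."""
    slip_index = 3 * codons_before_slip
    head = translate(seq[:slip_index])
    tail = translate(seq, start=slip_index + shift)
    return head + tail

# Hypothetical toy ORF (not a real dnaX sequence): ATG GCA AAA AAG AGT GAA CCG TAA
TOY_ORF = "ATGGCAAAAAAGAGTGAACCGTAA"

full_length = translate(TOY_ORF)               # zero-frame ("tau-like") product
truncated = translate_with_slip(TOY_ORF, 4)    # -1 slip after the fourth codon
print(full_length, truncated)
```

In this toy case the shifted frame reads GAG and then the stop codon TGA, so the shorter product shares its N-terminus with the full-length product and ends with a single residue that the full-length product lacks at that position.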

The sequences of the γ and τ subunits encoded by the dnaX gene are homologous to
the clamp loading subunits in all other organisms extending from gram negative
bacteria through gram positive bacteria, the Archaea Kingdom and the Eukaryotic
Kingdom from yeast to humans (O'Donnell et al., 1993). All of these organisms
utilize a three component replicase (DNA polymerase, clamp and clamp loader) and in
these cases the 3 components appear to behave as independent units in solution rather
than forming a large holoenzyme superstructure. For example, in eukaryotes from
yeast to humans, the clamp loader is the five subunit RFC, the clamp is PCNA and the
polymerases δ and ε are all stimulated by the PCNA clamp assembled onto primed
DNA by RFC (reviewed in Kelman et al., 1994).

The discovery of a dnaX gene in T.th provided confidence that thermophilic bacteria
would contain a three component Pol III-type enzyme. Hence, we proceeded to
identify the dnaQ and dnaN genes encoding, respectively, the proofreading 3'-5'
exonuclease, and the β DNA sliding clamp subunits of a Pol III-type enzyme.
Following this, we purified from extracts of T.th cells, a Pol III-type enzyme. This
enzyme preparation had the unique property of extending a single primer around a
long 7.2 kb single strand DNA genome of M13mp18 bacteriophage. Such a primer
extension assay serves as a tool to detect and identify the Pol III-type of enzyme in
cell extracts. The enzyme was confirmed to be a Pol III-type enzyme based on its
reactivity with antibody directed against the E. coli α subunit (the DNA polymerase
subunit) and antibody directed against E. coli γ subunit. Proteins corresponding to α,
τ, γ, δ and δ' were easily visible and will lend themselves to identification of the
genes through use of peptide microsequencing followed by primer design for PCR
amplification. From this DNA pol III-type preparation we obtained peptide sequence
of the α subunit enabling us to obtain the dnaE gene encoding the α subunit (DNA
polymerase) of the Pol III-type enzyme.

These methods should be widely applicable to other thermophilic bacteria. Additional
antibody reagents against other Pol III-type enzyme components, such as RFC
subunits, DNA polymerase delta, epsilon or beta, and the PCNA clamp from known
organisms can be made quite easily as polyclonal or monoclonal antibody
preparations using as antigen either naturally purified sequence, recombinant
sequence, or synthetic peptide sequence. Examples of known sequences of these Pol
III-type enzymes are to be found in: DNA polymerases (Braithwaite and Ito, 1993),
RFC clamp loaders (Cullman et al., 1995), and PCNA (Kelman and O'Donnell,
1995).

Braithwaite, D.K. and Ito, J. (1993) Compilation, alignment, and phylogenetic
relationships of DNA polymerases. Nucl. Acids Res. 21, 787-802.

Cullman, G., Fein, K., Kobayashi, R., and Stillman, B. (1995) Characterization of the
five replication factor C genes of Saccharomyces cerevisiae. Mol. Cell. Biol. 15,
4661-4671.

Kelman, Z., and O'Donnell, M. (1995) Structural and functional similarities of
prokaryotic and eukaryotic DNA polymerase sliding clamps. Nucl. Acids Res. 23,
3613-3620.

The remaining genes of Pol III needed for efficient extension of primed templates
should be easy to obtain from the T.th. Pol III by similar methods as those described
herein. These genes will provide the subunit preparations through use of standard
recombinant techniques and protein purification protocols. The protein subunits can
then be used to reconstitute the enzyme complexes as they exist in the cell. This type
of reconstitution of Pol III has been demonstrated using the protein subunits of DNA
polymerase III holoenzyme from E. coli to assemble the entire particle. See e.g., U.S.
Patent No. 5,583,026, issued December, 1996, O'Donnell, M.E.; and U.S. Patent No.
5,668,004, issued September, 1997, both to one of the inventors herein, and Onrust et
al. 1995b. The disclosures of these references are incorporated herein in their
entireties.

The following experiments illustrate the identification and characterization of the
enzymes and constructs of the present invention. Accordingly, in Examples 1-8
below, the identification and expression of the γ and τ subunits is presented, as the first
step in the elucidation of the Polymerase III reflective of the present invention.
Examples 9-13 which follow set forth the protocol for the purification of the remainder
of the subunits of the enzyme that represent substantial entirety of the functional
replicative machinery of the enzyme.

EXAMPLE 1 

EXPERIMENTAL PROCEDURES

Materials - DNA modification enzymes were from New England Biolabs. Labelled
nucleotides were from Amersham, and unlabeled nucleotides were from New England
Biolabs. The Alter-1 vector was from Promega. pET plasmids and E. coli strains,
BL21(DE3) and BL21(DE3)pLysS, were from Novagen. Oligonucleotides were from
Operon. Buffer A is 20 mM Tris-HCl (pH 7.5), 0.1 mM EDTA, 5 mM DTT, and 10%
glycerol.

Genomic DNA

Thermus thermophilus (strain HB8) was obtained from the American Type Culture
Collection. Genomic DNA was prepared from cells grown in 0.1 l of Thermus
medium N697 (ATCC: 4.0 g yeast extract, 8.0 g polypeptone (BBL 11910), 2.0 g NaCl,
30.0 g agar, 1.0 L distilled water) at 75°C overnight. Cells were collected by
centrifugation at 4°C and the cell pellet was resuspended in 25 ml of 100 mM
Tris-HCl (pH 8.0), 0.05 M EDTA, 2 mg/ml lysozyme and incubated at room
temperature for 10 min. Then 25 ml of 0.10 M EDTA (pH 8.0), 6% SDS was added and
mixed, followed by 60 ml of phenol. The mixture was shaken for 40 min. followed by
centrifugation at 10,000 x g for 10 min. at room temperature. The upper phase (50
ml) was removed and mixed with 50 ml of phenol:chloroform (50:50 v/v) for 30 min.
followed by centrifugation for 10 min. at room temperature. The upper phase was
decanted and the DNA was precipitated upon addition of 1/10th volume 3 M sodium
acetate (pH 6.5) and 1 volume ethanol. The precipitate was collected by
centrifugation and washed twice with 2 ml of 80% ethanol, dried and resuspended in 1
ml T.E. buffer (10 mM Tris-HCl (pH 7.5), 1 mM EDTA).

Cloning of dnaX - DNA oligonucleotides for amplification of T.th genomic DNA
were as follows. The upstream 32mer
(5'-CGC AAGCTT CACGCSTACCTSTTCTCCGGSAC-3') (S indicates a mixture of
G and C) consists of a HindIII site within the first 9 nucleotides (underlined)
followed by codons encoding the following sequence (HAYLFSGT). The
downstream 34mer (5'-CGCGAATTCGTGCTCSGGSGGCTCCTCSAGSGTC-3')
consists of an EcoRI site (underlined) followed by codons encoding the sequence
KTLEEPPEH on the complementary strand. The amplification reactions contained 10
ng T.th genomic DNA, 0.5 mM of each primer, in a volume of 100 µl of Vent
polymerase reaction mixture according to the manufacturer's instructions (10 µl
ThermoPol Buffer, 0.5 mM each dNTP and 0.5 mM MgSO4). Amplification was
performed using the following cycling scheme: 5 cycles of: 30 s at 95.5°C, 30 s at
40°C, 2 min. at 72°C; 5 cycles of: 30 s at 95.5°C, 30 s at 45°C, and 2 min. at 72°C;
and 30 cycles of: 30 s at 95.5°C, 30 s at 50°C, and 30 s at 72°C. Products were
visualized in a 1.5% native agarose gel.
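
As a rough illustration of what the degenerate positions in the two primers imply, the following Python sketch expands each primer into the individual sequences present in the oligonucleotide mixture. The primer strings are taken from the text (with spacing removed); the names `IUPAC`, `expand`, `UPSTREAM` and `DOWNSTREAM` are introduced here for illustration only.

```python
from itertools import product

# Only the symbols used in these primers: S stands for a mixture of G and C.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "S": "GC"}

UPSTREAM = "CGCAAGCTTCACGCSTACCTSTTCTCCGGSAC"      # 32mer carrying a HindIII site (AAGCTT)
DOWNSTREAM = "CGCGAATTCGTGCTCSGGSGGCTCCTCSAGSGTC"  # 34mer carrying an EcoRI site (GAATTC)

def expand(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer)
    return ["".join(combo) for combo in product(*choices)]

upstream_pool = expand(UPSTREAM)
downstream_pool = expand(DOWNSTREAM)

# Primer length and number of distinct sequences in each mixture.
print(len(UPSTREAM), len(upstream_pool))
print(len(DOWNSTREAM), len(downstream_pool))
```

Each S position doubles the number of sequences in the mixture, so the three S positions of the upstream primer and the four of the downstream primer give pools of 8 and 16 sequences respectively, every one of which retains the restriction site at positions 4 to 9.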



Genomic DNA was digested with either XhoI, XbaI, StuI, PstI, NcoI, MluI, KpnI,
HindIII, EcoRI, EagI, BglI, or BamHI followed by Southern analysis in a native
agarose gel (Maniatis et al., 1982). Approximately 0.5 µg of digest was analyzed in
each lane of a 0.8% native agarose gel followed by transfer to an MSI filter (Micron
Separations Inc.). The transfer included the following steps:

1. The agarose gel was soaked in 500 ml of 1% HCl with gentle shaking for 10 min.

2. Then the gel was soaked in 500 ml of 0.5 M NaOH + 1.5 M NaCl for 40 min.

3. After that the gel was soaked in 500 ml of 1 M ammonium acetate for 1 h.

4. The DNA was transferred to the MSI filter with the use of blotting paper for 4 h.

5. The filter was kept at 80°C for 15 min. in the oven.

6. The pre-hybridization step was run in 10 ml of Hybridization solution (1%
crystalline BSA (fraction V) (Sigma), 1 mM EDTA, 0.5 M NaHPO4 (pH 7.2), 7%
SDS) at 65°C for 30 min.

7. The probe, radiolabeled by the random priming method (see below), was added to
the pre-hybridization solution and kept at 65°C for 12 h.

8. The filter was washed with low stringency with 200 ml of the wash buffer (0.5%
BSA (fraction V), 1 mM Na2EDTA, 40 mM NaHPO4 (pH 7.2), 5% SDS) with gentle
shaking for 20 min. This step was repeated 5 times, followed by exposure to X-ray
film (XAR-5, Kodak).



As a probe, the PCR product was radiolabeled by random priming as follows.

1. 14 µl of the mixture containing 0.2 µg of PCR product DNA, 1 µg of the pd(N6)
(Promega) and 2.5 µl of the 10X Klenow reaction buffer (100 mM Tris-HCl (pH 7.5),
50 mM MgCl2, 75 mM dithiothreitol) were boiled for 10 min. and then kept at 4°C.

2. The reaction volume was increased up to 25 µl, containing in addition 33 µM of
each dNTP, except dATP, 10 µCi [α-32P]dATP (800 Ci/mmol), and 2 units of Klenow
enzyme. The reaction mixture was incubated 1.5 h.

3. 2 mg of sonicated herring sperm DNA (GibcoBRL) was added to the reaction and
the volume was increased to 2 ml using hybridization solution. The sample was then
boiled for 10 min.
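
The buffer arithmetic in the labeling reaction can be checked with a short calculation. This is a minimal sketch that assumes the small volumes in the protocol are microlitres, that is, 2.5 µl of the 10X Klenow reaction buffer in a final reaction of 25 µl; the function name `final_conc` and the dictionary below are illustrative only.

```python
def final_conc(stock_conc, stock_volume, final_volume):
    """Dilution rule C1*V1 = C2*V2, solved for the final concentration C2."""
    return stock_conc * stock_volume / final_volume

# Components of the 10X Klenow reaction buffer as listed in the text (mM).
KLENOW_10X_MM = {"Tris-HCl": 100, "MgCl2": 50, "dithiothreitol": 75}

BUFFER_UL = 2.5     # assumed microlitres of 10X buffer
REACTION_UL = 25.0  # final reaction volume in microlitres

working = {
    name: final_conc(conc, BUFFER_UL, REACTION_UL)
    for name, conc in KLENOW_10X_MM.items()
}
fold = REACTION_UL / BUFFER_UL
print(fold, working)
```

Under that assumption the buffer is diluted ten-fold, which is consistent with its being a 10X stock used at 1X in the final reaction.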

A genomic library of XbaI digested DNA was prepared upon treating 1 µg genomic
T.th. DNA with 10 units of XbaI in 100 µl of NEBuffer N2 (50 mM NaCl, 10 mM
Tris-HCl (pH 7.9), 10 mM MgCl2, 1 mM DTT) for 2 h at 37°C. The digested DNA
was purified by phenol/chloroform extraction and ethanol precipitation. The Alter-1
vector (0.5 µg) (Promega) was digested with 1 unit of XbaI in NEBuffer N2 and then
purified by phenol/chloroform extraction and ethanol precipitation. One microgram
of genomic digest was incubated with 0.05 µg of digested Alter-1 and 20 U of T4
ligase in 30 µl of ligase buffer (50 mM Tris-HCl (pH 7.8), 10 mM MgCl2, 10 mM
DTT and 1 mM ATP) at 15°C for 12 h. The ligation reaction was transformed into
the DH5α strain of E. coli and transformants were plated on LB plates containing
ampicillin and screened for the dnaX insert using the radiolabeled PCR probe as
follows:

1. The colonies tested were lifted onto MSI filters, approximately 100 colonies to
each filter.

2. The filters, removed from the LB/Tc plates, were placed side up on a sheet of
Whatman 3 MM paper soaked with 0.5 M NaOH for 5 min.

3. The filters were transferred to a sheet of paper soaked with 1 M Tris-HCl (pH 7.5)
for 5 min.

4. The filters were placed on a sheet of paper soaked in 0.5 M Tris-HCl (pH 7.5), 1.25
M NaCl for 5 min.

5. After drying by air, the filters were heated in the oven at 80°C for 15 min. and then
were analyzed by Southern hybridization.

Plasmid DNA was prepared from 20 positive colonies; of these, 6 contained the
expected 4 kb insert when digested with XbaI. Sequencing of the insert was
performed by the Sanger method using the Vent polymerase sequencing kit according
to the manufacturer's instructions (New England Biolabs).

Identification of the dnaX gene

The dnaX genes of the gram negative E. coli and the gram positive B. subtilis share
more than 50% identity in amino acid sequence within the N-terminal 180 residues
containing the ATP-binding domain (Fig. 2). Two highly conserved regions (shown
in bold in Fig. 2) were used to design oligonucleotide primers for application of the
polymerase chain reaction to T.th. genomic DNA. The expected PCR product,
including the restriction sites (i.e. before cutting), is 345 nucleotides. Use of these
primers with genomic T.th. DNA resulted in a product of the expected size. The PCR
product was then radiolabeled and used to probe genomic DNA in a Southern
analysis (Fig. 3). Genomic DNA was digested with several different restriction
endonucleases, electrophoresed in a native agarose gel and then probed with the PCR
fragment. The Southern analysis showed an XbaI fragment of approximately 4 kb,
more than sufficient length to encode the dnaX gene. Other restriction nucleases
produced fragments that were significantly longer, or produced two or more fragments,
indicating the presence of a site within the coding sequence of dnaX.

To obtain full length dnaX, genomic DNA was digested with XbaI and ligated into
XbaI-digested Alter-1 vector. Ligated DNA was transformed into DH5 alpha cells,
and colonies were screened with the labeled PCR probe. Plasmid DNA was prepared
from 20 positive colonies and analyzed for the appropriate sized insert using XbaI.
Six of the twenty clones contained the expected 4 kb XbaI fragment as an insert, the
sequence of which is shown in Figs. 4A and 4B.

The frameshift site

The dnaX gene of E. coli produces two proteins, the γ and τ subunits, by a -1
frameshift (Tsuchihashi and Kornberg, 1990; Flower and McHenry, 1990; Blinkowa
and Walker, 1990). The full length product yields τ, and the frameshift results in
addition of one amino acid before encountering a stop codon to produce γ. The -1
frameshift site in the E. coli dnaX gene contains the sequence A AAA AAG, which
follows the X XXY YYZ rule found in retroviral genes (Jacks et al., 1988). This
"slippery sequence" preserves the initial two residues of the tRNAs in the aminoacyl
and peptidyl sites both before and after the frameshift. Mutagenesis of the E. coli
frameshifting site has shown that the first three residues can be nucleotides other
than A, but that A's in the second set of three nucleotides are important to
frameshifting (Tsuchihashi and Brown, 1992).
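
The X XXY YYZ pattern described above can be checked mechanically. The following
is a minimal sketch, not part of the original procedure, that scans a nucleotide
string for heptamers whose first three positions are one repeated base and whose
next three positions are one repeated base. The function name
find_slippery_heptamers and every test string other than A AAA AAG are illustrative
assumptions, and the sketch ignores reading frame.

```python
def find_slippery_heptamers(seq):
    """Return (index, heptamer) pairs that match the X XXY YYZ pattern.

    Positions 1-3 must be one repeated base (X), positions 4-6 must be one
    repeated base (Y, which may equal X), and position 7 (Z) may be any base.
    Reading frame is deliberately ignored in this sketch.
    """
    seq = seq.upper().replace(" ", "")
    hits = []
    for i in range(len(seq) - 6):
        heptamer = seq[i:i + 7]
        first_three_same = heptamer[0] == heptamer[1] == heptamer[2]
        next_three_same = heptamer[3] == heptamer[4] == heptamer[5]
        if first_three_same and next_three_same:
            hits.append((i, heptamer))
    return hits


# The E. coli dnaX heptamer quoted in the text.
print(find_slippery_heptamers("A AAA AAG"))
# A hypothetical sequence with no repeated-base triplets.
print(find_slippery_heptamers("ACGTACGTAC"))
```

A frame-aware version would additionally require that the heptamer start one
nucleotide before a codon boundary of the gene being examined.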

Immediately downstream of the stop codon is a potential stem-loop structure which
enhances frameshifting, presumably by causing the ribosome to pause. Further, the
AAG codon lacks a cognate tRNA in E. coli and thus the G residue may facilitate the
pause, and has been shown to aid the vigorous frameshifting observed in the E. coli
dnaX gene (Tsuchihashi and Brown, 1992). A fourth component of frameshifting in
the E. coli dnaX gene is the presence of an upstream Shine-Dalgarno sequence, which
is thought to pair with the 16S rRNA to increase the frequency of frameshifting still
further (Larsen et al., 1994).

Examination of the T.th. dnaX sequence reveals a single site that fulfills the X XXY
YYZ rule in which positions 4-7 are A residues. The site differs from that in E. coli
in that all seven residues are A, and the heptanucleotide sequence is flanked by
another A residue on each side (i.e. A9). Surprisingly, the stop codon immediately
downstream of this site is in the -2 frame, although there is a stop codon in the -1
frame 28 nucleotides downstream of the -2 stop codon. Indeed, a -2 frameshift would
fulfill the requirement that the first two nucleotides of each codon in the peptidyl and
aminoacyl sites be conserved during either a -1 or a -2 frameshift. As in the case of
E. coli dnaX, there are secondary structure stem-loop structures immediately
downstream. Finally, there is a Shine-Dalgarno sequence immediately adjacent to the
frameshift site, as well as another Shine-Dalgarno sequence 22 nucleotides upstream
of the frameshift site.

Assuming the first stop codon is utilized (i.e. -2 frameshift), the predicted size of the γ
subunit in T.th. is 454 amino acids for a mass of 49.8 kDa, over 2 kDa larger than the
431 residue γ subunit (47.5 kDa) of E. coli. This would result in 2 residues after the
-2 frameshift (i.e. after the GluLysLys, the residues LysAla would be added), to be
compared to the result of the -1 frameshift in E. coli, which also results in 2 residues
(LysGlu). In the event that a -1 frameshift were utilized in the T.th. dnaX gene, then
an additional 12 residues would be added following the frameshift for a molecular
mass of 50.8 kDa (i.e. after the GluLysLys, the residues
LysProAspProLysAlaProProGlyProThrSer would be added). As explained later, this
nucleotide sequence was found to promote both -1 and -2 frameshifting in E. coli (Fig.
8). But first, we examined T.th. cells by Western analysis for the presence of two
subunits homologous to E. coli γ and τ.

EXAMPLE 2 

Frameshifting analysis of the T.th. dnaX gene

Frameshifting was analyzed by inserting the frameshift site into lacZ in the three
different reading frames, followed by plating on X-gal and scoring for blue or white
colony formation (Weiss et al., 1987). The frameshifting region within T.th. dnaX
was subcloned into the EcoRI/BamHI sites of pUC19. These sites are within the
polylinker inside the β-galactosidase gene. Three constructs were produced such
that the insert was either in frame with the downstream coding sequence of
β-galactosidase, or out of frame (either -1 or -2). An additional three constructs
were designed by mutating the frameshift sequence and then placing this insert into
the three reading frames of the β-galactosidase gene. These six plasmids were
constructed as described below.

The upstream primer for the shifty sequences was 5'-gcg cgg atc cgg agg gag aaa aaa
aaa gcc tca gcc ca-3'. The BamHI site for cloning into pUC is underlined. Also, the
stop codon, tga, has been mutated to tca (also underlined). The upstream primer for
the mutant shifty sequence was: 5'-gcg cgg atc cgg agg gag aga aga aaa gcc tca gcc
ca-3'. The mutant sequence contains two substitutions of a G for an A residue in the
polyA stretch (underlined). Three downstream primers were utilized with each
upstream primer to create two sets of three inserts in the 0 frame, -1 frame and -2
frame. The sequences of these primers, and the lengths of the inserts (after cutting
with EcoRI and BamHI and inserting into pUC19), are as follows: 5'-gaa tta aat tcg
cgc ttc ggg agg tgg g-3' (0 frame, total 58 nucleotide insert); 5'-gcg cga att cgc gct
tcg gga ggt ggg-3' (-1 frame, 54mer insert); and 5'-gcg cga att cgg gcg ctt cag gag gtg
gg-3' (-2 frame, 56mer insert). The downstream primers have an EcoRI site
(underlined); the EcoRI site of the 0 frame insert was blunt ended to produce the
greater length insert (converting the EcoRI site to an aattaatt sequence). Also, the tcg
sequence, which produces the tga stop codon (underlined), was mutated to tca in the -2
downstream primer so that readthrough would be allowed after the frameshift
occurred.
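
The frame assignments of the three inserts follow from their lengths alone. The
sketch below is an illustration rather than part of the cloning procedure; it
assumes the 58 nucleotide insert defines the 0 frame, as stated above, and the
helper name relative_frame is hypothetical.

```python
def relative_frame(insert_len, zero_frame_len=58):
    """Return the frame (0, -1 or -2) of an insert relative to the 0 frame.

    An insert that is shorter than the in-frame insert by one nucleotide
    (modulo 3) needs a -1 ribosomal shift to restore the downstream lacZ
    reading frame; one that is shorter by two (modulo 3) needs a -2 shift.
    """
    missing = (zero_frame_len - insert_len) % 3
    return -missing if missing else 0


# Insert lengths quoted in the text: 58 (0 frame), 54 (-1) and 56 (-2).
for length in (58, 54, 56):
    print(length, relative_frame(length))
```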

In summary, a region surrounding the frameshift site and ending at least 5 nucleotides
past the -1 frameshift stop codon was inserted into the β-galactosidase gene of pUC19
in the three different reading frames (stop codons were mutated to prevent stoppage
following a frameshift). These three plasmids were introduced into E. coli and plated
with X-gal. The results, in Fig. 8, show that blue colonies were observed after 24 h
incubation with all three plasmids and therefore both -1 and -2 frameshifting had
occurred.

To extend these results, two G residues were introduced into the polyA tract, which
should disrupt the ability of this sequence to direct frameshifts. The mutated slippery
sequence was inserted into pUC19, followed by transformation into E. coli and plating
on X-gal. The results showed that both -1 and -2 frameshifting were prevented, further
supporting the conclusion that frameshifting requires the polyA tract, as expected
(Fig. 8).

EXAMPLE 3

Expression vector for T.th. γ and τ

The dnaX gene was cloned into the pET16 expression vector in the steps shown in
Fig. 9. First, the bulk of the gene was cloned into pET16 by removing the PmlI/XbaI
fragment from pAlterdnaX, and placing it into SmaI/XbaI digested pUC19 to yield
pUC19dnaXCterm. The N-terminal sequence of the dnaX gene was then reconstructed
to position an NdeI site at the N-terminus. This was performed by amplifying the 5'
region encoding the N-terminal section of γ/τ using an upstream primer containing an
NdeI site that hybridizes to the dnaX gene at the initiating gtg codon (i.e. to encode
Met-Val, where the Met is created by the PCR primer, and the Val is the initiating gtg
start codon of dnaX). The primer sequence for this 5' end was: 5'-gtggtgcatatg gtg agc
gcc ctc tac cgc c-3' (where the NdeI site is underlined, and the coding sequence of
dnaX follows). The downstream primer hybridizes past the PmlI site at nucleotide
positions 987 - 1004 downstream of the initiating gtg (primer sequence:
5'-gtggtggtcgac cca gga ggg cca cct cca g-3', where the initial 12 nucleotides contain a
SalGI restriction site, followed by the sequence from the region downstream of the
stop codon). The 1.1 kb PCR product was digested with PmlI/NdeI and the PmlI/NdeI
fragment was ligated into NdeI/PmlI digested pUC19dnaXCterm to form pUC19dnaX.
The pUC19dnaX plasmid was then digested with NdeI and SalI and the 1.9 kb fragment
containing the dnaX gene was purified using the Sephaglas BandPrep Kit
(Pharmacia-LKB). pET16b was digested with NdeI and XhoI. Then the full length
dnaX gene was ligated into the digested pET16b to form pETdnaX.
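
As a quick consistency check on the two PCR primers described above, the sketch
below locates the NdeI (catatg) and SalI (gtcgac) recognition sequences in each
primer. The primer strings are transcribed from the text with apparent scanning
errors corrected, which is an assumption, and the helper name locate_sites is
hypothetical.

```python
# Recognition sequences of the two enzymes named in the text.
SITES = {"NdeI": "catatg", "SalI": "gtcgac"}


def locate_sites(primer, sites=SITES):
    """Map each enzyme name to the 0-based index of its site, or -1 if absent."""
    cleaned = primer.lower().replace(" ", "")
    return {name: cleaned.find(site) for name, site in sites.items()}


upstream = "gtggtgcatatg gtg agc gcc ctc tac cgc c"
downstream = "gtggtggtcgac cca gga ggg cca cct cca g"

print(locate_sites(upstream))
print(locate_sites(downstream))
```

In both primers the site begins at index 6, that is, within the first 12
nucleotides, which agrees with the description of the downstream primer.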

EXAMPLE 4 - Expression of T.th. γ and τ

As discussed in the previous example, the dnaX gene was engineered into the T7
based IPTG inducible pET16 vector such that the initiation codon was placed
immediately after the Met residue of the N-terminal leader sequence (Fig. 9). This
should produce a protein containing the entire sequence of γ and τ, along with a 21
residue leader containing 10 contiguous His residues (tagged-τ = 60.6 kDa; tagged-γ =
52.4 kDa for -2 frameshift). The pETdnaX plasmid was introduced into
BL21(DE3)pLysS cells harboring the gene encoding T7 RNA polymerase under control
of the lac repressor. Log phase cells were induced with IPTG and analyzed before and
after induction in an SDS polyacrylamide gel (Fig. 10, lanes 1 and 2). The result shows
that upon induction, two new proteins are expressed with the approximate sizes
expected of the T.th. γ and τ subunits (larger than E. coli γ, and smaller than E. coli
τ). The two proteins are produced in nearly equal amounts, similar to the case of the
E. coli γ and τ subunits. Western analysis using antibodies against the E. coli γ and τ
subunits cross reacted with the induced proteins, further supporting their identity as
T.th. γ and τ (data not shown, but repeated with the pure subunits shown in Fig. 10,
lane 6).



EXAMPLE 5 

Purification of T.th. γ and τ

The His-tagged T.th. γ and τ proteins were purified from 6 L of induced E. coli cells
containing the pETdnaX plasmid. The cells were lysed, the lysate was clarified of cell
debris by centrifugation, and the supernatant was applied to a HiTrap chelate affinity
column. Elution of the chelate affinity column yielded approximately 35 mg of protein
in which the two predominant bands migrated in a region consistent with the molecular
weight predicted from the dnaX gene (Fig. 10, lane 3), and produced a positive signal
by Western analysis using polyclonal antibody directed against the E. coli γ and τ
subunits (lane 4). The γ and τ subunits are present in nearly equal amounts, consistent
with the nearly equal expression of these proteins in E. coli cells harboring the
pETdnaX plasmid.

The γ and τ subunits were further purified by gel filtration on a Superose 12 column
(Fig. 10, lane 4; Fig. 11). Recovery of T.th. γ and τ subunits through gel filtration
was 81%. The E. coli γ and τ subunits, when separated from one another, elute
during gel filtration as tetramers. A mixture of E. coli γ/τ results in a mixed tetramer
of γ2τ2 along with τ4 and γ4 tetramers (Onrust et al., 1995). The mixture of T.th.
γ/τ elutes ahead of the 150 kDa marker, and thus is consistent with the expected mass
of a γ2τ2 tetramer (225 kDa) and τ4 and γ4 tetramers.
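
The tetramer masses implied by the tagged subunit masses can be tallied directly.
The sketch below is illustrative only; it assumes the His-tagged masses quoted in
Example 4 (52.4 kDa for the -2 frameshift γ product and 60.6 kDa for τ), and the
function name tetramer_mass_kda is hypothetical.

```python
TAGGED_GAMMA_KDA = 52.4  # His-tagged gamma, -2 frameshift product
TAGGED_TAU_KDA = 60.6    # His-tagged tau


def tetramer_mass_kda(n_gamma, n_tau):
    """Mass in kDa of a tetramer built from n_gamma and n_tau monomers."""
    if n_gamma < 0 or n_tau < 0 or n_gamma + n_tau != 4:
        raise ValueError("a tetramer needs exactly four subunits")
    return n_gamma * TAGGED_GAMMA_KDA + n_tau * TAGGED_TAU_KDA


# Mixed tetramer, then the two homotetramers.
for n_gamma, n_tau in ((2, 2), (4, 0), (0, 4)):
    print(n_gamma, n_tau, round(tetramer_mass_kda(n_gamma, n_tau), 1))
```

All three species come out well above the 150 kDa marker, and the mixed tetramer
lands within about 1 kDa of the 225 kDa figure given above.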

As described earlier, the dnaX frameshifting sequence could produce either a -1 or -2
frameshift to yield a His-tagged γ subunit of mass either 53.3 kDa or 52.4 kDa,
respectively. These two possible products are too close in mass to distinguish by
migration in SDS gels. It also remains possible that two γ products are present
and do not resolve under the conditions used. The exact protocol for this purification
is described below.

Six liters of BL21(DE3)pLysSpETdnaX cells were grown in LB media containing 50
μg/ml ampicillin and 25 μg/ml chloramphenicol at 37°C to an O.D. of 0.8 and then
IPTG was added to a concentration of 2 mM. After a further 2 h at 37°C, cells were
harvested by centrifugation and stored at -70°C. The following steps were performed
at 4°C. Cells (15 g wet weight) were thawed and resuspended in 45 ml 1X binding
buffer (5 mM imidazole, 0.5 M NaCl, 20 mM Tris-HCl (final pH 7.5)) using a Dounce
homogenizer to complete cell lysis, and 450 μl of 5% polyamine P (Sigma) was
added. Cell debris was removed by centrifugation at 18,000 rpm for 30 min in a
Sorvall SS24 rotor at 4°C. The supernatant (Fraction I, 40 ml, 376 mg protein) was
applied to a 5 ml HiTrap Chelating Sepharose column (Pharmacia-LKB). The column
was washed with 25 ml of binding buffer, then with 30 ml of binding buffer
containing 60 mM imidazole, and then eluted with 30 ml of 0.5 M imidazole, 0.5 M
NaCl, 20 mM Tris-HCl (pH 7.5). Fractions of 1 ml were collected and analyzed on an
8% Coomassie Blue stained SDS polyacrylamide gel. Fractions containing subunits
migrating at the T.th. γ and τ positions, and exhibiting cross reactivity with antibody
to E. coli γ and τ in a Western analysis, were pooled and dialyzed against buffer A
(20 mM Tris-HCl (pH 7.5), 0.1 mM EDTA, 5 mM DTT and 10% glycerol) containing
0.5 M NaCl (Fraction II, 36 mg in 7 ml). Fraction II was diluted 2-fold with buffer A
and passed through a 2 ml ATP agarose column equilibrated in buffer A containing
0.2 M NaCl to remove any E. coli γ complex contaminant. Then 0.18 mg (300 μl) of
Fraction II was gel filtered on a 24 ml Superose 12 column (Pharmacia-LKB) in
buffer A containing 0.5 M NaCl. After the first 216 drops, fractions of 200 μl were
collected (Fraction III) and analyzed by Western analysis (by procedures similar to
those described in Example 6), by ATPase assays and by Coomassie Blue staining of
an 8% SDS polyacrylamide gel. The Coomassie stained gels and Western analysis of
recombinant T.th. γ and τ for these purification steps are summarized in Fig. 10.
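
The protein figures quoted for Fraction I and Fraction II allow a simple recovery
calculation. The sketch below is illustrative; it tracks total protein only (not
activity), and the function name percent_recovery is hypothetical.

```python
def percent_recovery(input_mg, output_mg):
    """Protein recovered in a step, as a percentage of the protein applied."""
    if input_mg <= 0:
        raise ValueError("input_mg must be positive")
    return 100.0 * output_mg / input_mg


# Values quoted in the protocol above.
fraction_i_mg = 376.0   # clarified supernatant applied to the chelate column
fraction_ii_mg = 36.0   # pooled and dialyzed chelate column eluate

step = percent_recovery(fraction_i_mg, fraction_ii_mg)
print(f"Fraction I to Fraction II: {step:.1f}% of total protein")
```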

EXAMPLE 6

Western Analysis of T.th. cells for presence of γ and τ subunits

Polyclonal antibody to E. coli γ/τ - E. coli γ subunit was prepared as described
(Studwell-Vaughan and O'Donnell, 1991). Pure γ subunit (100 μg) was brought up in
Freund's adjuvant and injected subcutaneously into a New Zealand rabbit (Pocono
Rabbit Farms). After two weeks, a booster consisting of 50 μg γ in Freund's adjuvant
was administered, followed after two weeks by a third injection (50 μg).

The homology between the amino terminal regions of T.th. and E. coli γ/τ subunits
suggested that there may be some epitopes in common between them. Hence,
polyclonal antibody directed against the E. coli γ/τ subunits was raised in rabbits for
use in probing T.th. cells by Western analysis. Fig. 7 shows the results of a Western
analysis of whole T.th. cells lysed in SDS. The results show that in T.th. cells, the
antibody is rather specific for two high molecular weight proteins which migrate in the
vicinity of the molecular masses of E. coli γ and τ subunits.



Procedure for Western Analysis

Samples were analyzed in duplicate 10% SDS polyacrylamide gels by the Western
method (Towbin et al., 1979). One gel was Coomassie stained to evaluate the pattern
of proteins present, and the other gel was then electroblotted onto a nitrocellulose
membrane (Schleicher and Schuell). For molecular size markers, the Kaleidoscope
molecular weight markers (Bio-Rad) were used to verify by visualization that transfer
of proteins onto the blotted membrane had occurred. The gel used in electroblotting
was also stained after electroblotting to confirm that efficient transfer of protein had
occurred. Membranes were blocked using 5% non-fat milk, washed with 0.05%
Tween in TBS (TBS-T) and then incubated for over 1 h with a 1/5000 dilution of
rabbit polyclonal antibody directed against E. coli γ and τ in 1% gelatin in TBS-T at
room temperature. Membranes were washed using TBS-T buffer and then antibody
was detected on X-ray film (Kodak) by using the ECL kit from Amersham and the
manufacturer's recommended procedures.

Samples included: 1) a mixture of E. coli γ (15 ng) and τ (15 ng) subunits; 2) T.th.
whole cells (100 suspended in cracking buffer); and 3) purified T.th. γ and τ
fraction II (0.6 μg as a mixture).

EXAMPLE 7 
Characterization of the ATPase Activity of γ/τ

The E. coli τ subunit is a DNA dependent ATPase (Lee and Walker, 1987;
Tsuchihashi and Kornberg, 1989). The γ subunit binds ATP but does not hydrolyze it
even in the presence of DNA unless other subunits of the DNA polymerase III
holoenzyme are also present (Onrust et al., 1991). Next we examined the T.th. γ/τ
subunits for DNA dependent ATPase activity. The γ/τ preparation was, in fact, a
DNA stimulated ATPase (Fig. 11, top panel). The specific activity of the T.th. γ/τ
was 11.5 mol ATP hydrolyzed/mol γ/τ (as monomer and assuming an equal mixture
of the two). Furthermore, analysis of the gel filtration column fractions shows that the
ATPase activity coelutes with the T.th. γ/τ subunits, supporting the conclusion that the
weak ATPase activity is intrinsic to the γ/τ subunits (Fig. 11). The specific activity of
the γ/τ preparation before gel filtration was the same as after gel filtration (within
10%), further indicating that the DNA stimulated ATPase is an inherent activity of the
γ/τ subunits. Presumably, only the τ subunit contains ATPase activity, as in the case
of E. coli. Assuming only T.th. τ contains ATPase activity, its specific activity is twice
the observed rate (after factoring out the weight of γ). This rate is still only one-fifth
that of E. coli τ.

The T.th. γ/τ ATPase activity is lower at 37°C than at 65°C (middle panel),
consistent with the expected behavior of protein activity from a thermophilic source.
However, there is no apparent increase in activity in proceeding from 50°C to 65°C
(the rapid breakdown of ATP above 65°C precluded measurement of ATPase activity
at temperatures above 65°C). In contrast, the E. coli τ subunit lost most of its ATPase
activity upon elevating the temperature to 50°C (middle panel). These reactions
contain no stabilizers such as a nonionic detergent or gelatin, nor did they include
substrates such as ATP, DNA or magnesium.

Last, the relative stability of T.th. γ/τ and E. coli γ/τ to addition of NaCl (Fig. 12,
bottom panel) was examined. Whereas the E. coli τ subunit rapidly lost activity at
even 0.2 M NaCl, the T.th. γ/τ retained full activity in 1.0 M NaCl and was still 80%
active in 1.5 M NaCl. The detailed procedure for the ATPase activity assay is
described below.

ATPase assays: ATPase assays were performed in 20 μl of 20 mM Tris-HCl (pH
7.5), 8 mM MgCl2 containing 0.72 μg of M13mp18 ssDNA (where indicated), 100
mM [γ-32P]-ATP (specific activity of 2000-4000 cpm/pmol), and the indicated
protein. Some reactions contained additional NaCl where indicated. Reactions were
incubated at the temperatures indicated in the figure legends for 30 min and then were
quenched with an equal volume of 25 mM EDTA (final). The aliquots were analyzed
by spotting them (1 μl each) onto thin layer chromatography (TLC) sheets coated with
Cel-300 polyethyleneimine (Brinkmann Instruments Co.). TLC sheets were
developed in 0.5 M lithium chloride, 1 M formic acid. An autoradiogram of the TLC
chromatogram was used to visualize Pi at the solvent front and ATP near the origin,
which were then cut from the TLC sheet and quantitated by liquid scintillation. The
extent of ATP hydrolyzed was used to calculate the mol of Pi released per mol of
protein per min. One mol of E. coli τ was calculated assuming a mass of 71 kDa per
monomer. The T.th. γ and τ preparation was treated as an equal mixture and thus one
mole of protein as monomer was the average of the predicted masses of the γ and τ
subunits (54 kDa).
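
The calculation described above can be sketched as follows. This is an illustration
under stated assumptions, not the authors' worksheet: the function names are
hypothetical, and the counts, ATP amount and protein amount in the usage lines are
made-up inputs chosen only to exercise the arithmetic. The averaged monomer mass
uses the predicted masses given earlier (49.8 kDa for γ from a -2 frameshift and
58.0 kDa for τ).

```python
def fraction_hydrolyzed(pi_cpm, atp_cpm):
    """Fraction of ATP converted to Pi, from counts in the two TLC spots."""
    return pi_cpm / (pi_cpm + atp_cpm)


def protein_pmol(micrograms, mass_kda):
    """Convert micrograms of protein to pmol (1 ug of a 1 kDa protein = 1000 pmol)."""
    return micrograms * 1000.0 / mass_kda


def turnover_per_min(frac, atp_pmol, protein_ug, mass_kda, minutes):
    """mol Pi released per mol protein (as monomer) per min."""
    pi_pmol = frac * atp_pmol
    return pi_pmol / protein_pmol(protein_ug, mass_kda) / minutes


# Average of the predicted gamma and tau masses; rounds to the 54 kDa in the text.
mean_mass_kda = (49.8 + 58.0) / 2

# Hypothetical inputs: 300 cpm in Pi, 700 cpm in ATP, 2000 pmol ATP in the
# reaction, 0.54 ug protein, 30 min incubation.
frac = fraction_hydrolyzed(300.0, 700.0)
print(round(mean_mass_kda, 1))
print(turnover_per_min(frac, 2000.0, 0.54, 54.0, 30.0))
```

If only τ is active in an equal mixture, the rate per τ monomer is twice the value
returned by turnover_per_min, as noted in the text.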

EXAMPLE 7 

Western analysis of T.th. cells for presence of γ and τ subunits - The homology
between the amino terminal regions of T.th. and E. coli γ/τ subunits suggested that
there may be some epitopes in common between them. Hence, polyclonal antibody
directed against the E. coli γ/τ subunits was raised in rabbits for use in probing T.th.
cells by Western analysis. Fig. 7 shows the results of a Western analysis of whole
T.th. cells lysed in SDS. The results show that in T.th. cells, the antibody is rather
specific for two high molecular weight proteins which migrate in the vicinity of the
molecular masses of E. coli γ and τ subunits.



EXAMPLE 8 

Homology of T.th. γ/τ to dnaX gene products of other organisms
The XbaI insert encoded an open reading frame, starting with a GTG codon, of 529
amino acids in length (58.0 kDa), closer to the predicted length of the B. subtilis τ
subunit (563 amino acids, 62.7 kDa mass) (Alonso et al., 1986) than the E. coli τ
subunit (71.1 kDa) (Yin et al., 1986). dnaX encoding the γ/τ subunits of E. coli DNA
polymerase III holoenzyme is homologous to the holB gene encoding the δ' subunit of
the γ complex clamp loader, and this homology extends to all 5 subunits of the
eukaryotic RFC clamp loader as well as the bacteriophage gene protein 44 of the
gp44/62 clamp loading complex (O'Donnell et al., 1993). These gene products show
greatest homology over the N-terminal 166 amino acid residues (of E. coli dnaX); the
C-terminal regions are more divergent. Fig. 4 shows an alignment of the amino acid
sequence of the N-terminal regions of the T.th. dnaX gene product to those of several
other bacteria. The consensus GXXGXGKT motif for nucleotide binding is
conserved in all these protein products. Further, the E. coli δ' crystal structure reveals
one atom of zinc coordinated to four Cys residues (Guenther, 1996). These four Cys
residues are conserved in the E. coli dnaX gene, and the γ and τ subunits encoded by
E. coli dnaX bind one atom of zinc (J. Turner and M. O'Donnell, unpublished). These
Cys residues are also conserved in T.th. dnaX (shown in Fig. 4). Overall, the level of
amino acid identity relative to E. coli dnaX in the N-terminal 165 residues of T.th.
dnaX is 53%. The T.th. dnaX gene is just as homologous to the B. subtilis dnaX gene
(53% identity) as it is to E. coli dnaX. After this region of homology, the C-terminal
region of T.th. dnaX shares 26% and 20% identity to E. coli and B. subtilis dnaX,
respectively. A proline rich region, downstream of the conserved region, is also
present in T.th. dnaX (residues 346-375), but not in the B. subtilis dnaX (see Figs. 3A
and 3B). The overall identity between E. coli dnaX and T.th. dnaX over the entire gene
is 34%. Identity of T.th. dnaX to B. subtilis dnaX over the entire gene is 28%.
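
Identity figures of the kind quoted above are computed from aligned sequences. The
sketch below shows one common convention (matches divided by aligned, ungapped
positions); whether the values in this example used that convention is not stated,
so this is an assumption. The function name percent_identity and the short toy
sequences are hypothetical and are not taken from the dnaX alignments.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over positions where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x != gap and y != gap
    ]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


# Toy octapeptides that both fit the GXXGXGKT consensus; 7 of 8 positions match.
print(percent_identity("GPRGVGKT", "GPAGVGKT"))
```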

Comparison of dnaX genes from T.th. and E. coli

The above identifies a homologue of the dnaX gene of E. coli in Thermus
thermophilus. Like the E. coli gene, T.th. dnaX encodes two related proteins through
use of a highly efficient translational frameshift. The T.th. γ/τ subunits are tetramers,
or mixed tetramers, similar to the γ and τ subunits of E. coli. Further, the γ/τ subunit
is a DNA stimulated ATPase like its E. coli counterpart. As expected for proteins
from a thermophile, the T.th. γ/τ ATPase activity is thermostable and resistant to
added salt.

In E. coli, γ is a component of the clamp loader, and the τ subunit serves the function
of holding the clamp loading apparatus together with two DNA polymerases for
coordinated replication of duplex DNA. The presence of γ in T.th. suggests it has a
clamp loading apparatus and thus a clamp as well. The presence of the τ subunit in
T.th. implies that T.th. contains a replicative polymerase with a structure similar to
that of E. coli DNA polymerase III holoenzyme.

A significant difference between E. coli and T.th. dnaX genes is in the translational
frameshift sequence. In E. coli, the heptamer frameshift site contains six A residues
followed by a G residue in the context A AAA AAG. This sequence satisfies the X
XXY YYZ rule for -1 frameshifting. The frameshift is made more efficient by the
absence of the AAG tRNA for Lys, which presumably leads to stalling of the ribosome
at the frameshift site and increases the efficiency of frameshifting (Tsuchihashi and
Brown, 1992). Two additional aids to frameshifting include a downstream hairpin
and an upstream Shine-Dalgarno sequence (Tsuchihashi and Kornberg, 1990; Larsen
et al., 1994). The -1 frameshift leads to incorporation of one unique residue at the
C-terminus of E. coli γ before encounter with a stop codon.

In T.th., the dnaX frameshifting heptamer is A AAA AAA, and it is flanked by two
other A residues, one on each side. There is also a downstream region of secondary
structure. The nearest downstream stop codon is positioned such that γ would
contain only one unique amino acid, as in E. coli. However, the T.th. stop codon is in
the -2 reading frame and thus requires a -2 frameshift. No precedent exists in nature
for -2 frameshifting, although -2 frameshifting has been shown to occur in test cases
(Weiss et al., 1987). In vivo analysis of the T.th. frameshift sequence shows that this
natural sequence promotes both -1 and -2 frameshifting in E. coli. Whereas the -2
frameshift results in only one unique C-terminal residue, a -1 frameshift would result
in an extension of 12 C-terminal residues. At present, the results do not discriminate
which path occurs in T.th., a -1 or -2 frameshift, or a combination of the two.

There are two Shine-Dalgarno sequences just upstream of the frameshift site in T.th.
dnaX. In two cases of frameshifting in E. coli, an upstream Shine-Dalgarno sequence
has been shown to stimulate frameshifting (reviewed in Weiss et al., 1987). In
release factor 2 (RF2), the Shine-Dalgarno is 3 nucleotides upstream of the shift site,
and it stimulates a +1 frameshift event. In the case of E. coli dnaX, a Shine-Dalgarno
sequence 10 nucleotides upstream of the shift sequence stimulates the -1 frameshift.
One of the T.th. dnaX Shine-Dalgarno sequences is immediately adjacent to the
frameshift sequence with no extra space; the other is 22 residues upstream of the
frameshift site. Which of these Shine-Dalgarno sequences plays a role in T.th. dnaX
frameshifting, if any, will require future study.

In E. coli, efficient separation of the two polypeptides, γ and τ, is achieved by
mutation of the frameshift site such that only one polypeptide is produced from the
gene (Tsuchihashi and Kornberg, 1990). Substitution of G for A in two positions of
the heptamer of T.th. dnaX eliminates frameshifting and thus should be a source to
obtain τ subunit free of γ. To produce pure γ subunit free of τ, the frameshifting site
and the sequence immediately downstream of it can be replaced with an in-frame
sequence with a stop codon.

Examination of the B. subtilis dnaX gene shows no frameshift sequence that satisfies
the X XXY YYZ rule. Hence, it would appear that dnaX does not make two proteins
in this gram positive organism.

Rapid thermal motions associated with high temperature may make coordination of
complicated processes more difficult. It seems possible that organizing the
components of the replication apparatus may become yet more important at higher
temperature. Hence, production of a τ subunit that could be used to crosslink two
polymerases and a clamp loader into one organized particle may be most useful at
elevated temperature.

As stated above, the following examples describe the continued isolation and
purification of the substantial entirety of the Polymerase III from the extreme
thermophile Thermus thermophilus. It is to be understood that the following
exposition is reflective of the protocol and characteristics, both morphological and
functional, of the Polymerase III-type enzymes that are the focus of the present
invention, and that the invention is hereby illustrated and comprehends the entire class
of enzymes of thermophilic origin.

EXAMPLE 9 

Purification of the Thermus thermophilus DNA polymerase III
All steps in the purification assay were performed at 4°C. The following assay was
used in the purification of DNA polymerase from T.th. cell extracts. Assays contained
2.5 mg activated calf thymus DNA (Sigma Chemical Company) in a final volume of
25 ml of 20 mM Tris-Cl (pH 7.5), 8 mM MgCl2, 5 mM DTT, 0.5 mM EDTA, 40
mg/ml BSA, 4% glycerol, 0.5 mM ATP, 3 mM each dCTP, dGTP, dATP, and 20 mM
[α-32P]dTTP. An aliquot of the fraction to be assayed was added to the assay mixture
on ice, followed by incubation at 60°C for 5 min. DNA synthesis was quantitated
using DE81 paper followed by washing off unincorporated nucleotide. Incorporated
nucleotide was determined by scintillation counting of the filters.

Thermus thermophilus cell extracts were prepared by suspending 35 grams of cell
paste in 200 ml of 50 mM Tris-HCl, pH 7.5, 30 mM spermidine, 100 mM NaCl, 0.5
mM EDTA, 5 mM DTT, 5% glycerol, followed by disruption by passage through a
French pressure cell (15,000 PSI). Cell debris was removed by centrifugation (12,000
RPM, 60 min). DNA polymerase III in the clarified supernatant was precipitated by
treatment with ammonium sulfate (0.226 gm/liter) and recovered by centrifugation.
This fraction was then backwashed with the same buffer (but lacking spermidine)
containing 0.20 gm/l ammonium sulfate. The pellet was then resuspended in buffer A
and dialyzed overnight against 2 liters of buffer A; a precipitate which formed during
dialysis was removed by centrifugation (17,000 RPM, 20 min).

The clarified dialysis supernatant, containing approximately 336 mg of protein, was 
applied onto a 60 ml heparin agarose column equilibrated in buffer A which was 
washed with the same buffer until A280 reached baseline. The column was developed 
with a 500 ml linear gradient of buffer A from 0 to 500 mM NaCl. More tightly 
adhered proteins were washed off the column by treatment with buffer A (20 mM
Tris-HCl, pH 7.5, 0.1 mM EDTA, 5 mM DTT, and 10% glycerol) and 1 M NaCl. Some
DNA polymerase activity flowed through the column. Two peaks (HEP.P1 and
HEP.P2) of DNA polymerase activity eluted from the heparin agarose column
containing 20 mg and 2 mg of total protein respectively (Fig. 13A). These were kept
separate throughout the remainder of the purification protocol. 
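
The nominal salt concentration at any point in the linear gradient described above follows from simple interpolation. The sketch below is illustrative only: the function name and the example elution volume are assumptions, and the calculation ignores column dead volume and mixing effects.

```python
def gradient_nacl_mM(volume_ml, total_ml=500.0, start_mM=0.0, end_mM=500.0):
    """Nominal NaCl concentration (mM) after `volume_ml` of a linear gradient."""
    if not 0.0 <= volume_ml <= total_ml:
        raise ValueError("volume lies outside the gradient")
    return start_mM + (end_mM - start_mM) * volume_ml / total_ml


# Hypothetical elution position: 150 ml into the 500 ml gradient.
print(gradient_nacl_mM(150.0))
```

With the 0 to 500 mM gradient run over 500 ml, the nominal concentration in mM equals the volume in ml; the defaults can be changed for other gradients.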

The Pol III resided in HEP.P1 as indicated by the following criteria: 1) Western
analysis using antibody directed against the α subunit of E. coli Pol III indicated
presence of Pol III in HEP.P1. 2) Only the HEP.P1 fraction was capable of extending
a single primer around an M13mp18 7.2 kb ssDNA circle (explained later in Example
14). This type of long primer extension is a characteristic of Pol III-type enzymes. 3)
Only the HEP.P1 provided DNA polymerase activity that was retained on an
ATP-agarose affinity column. This is indicative of a Pol III-type DNA polymerase
since the γ and τ subunits are ATP-interactive proteins.

The first peak of the heparin agarose column (HEP.P1: 20 mg in 127.5 ml) was
dialyzed against buffer A and applied onto a 2 ml N6-linkage ATP agarose column
pre-equilibrated in the same buffer. Bound protein was eluted by a slow (0.05
ml/min) wash with buffer A + 2 M NaCl and collected into 200 µl fractions.

Chromatography of peak HEP.P1 yielded a flow-through (HEP.P1-ATP-FT) and a
bound fraction (HEP.P1-ATP-Bound) (Fig. 13B). Binding of peak HEP.P2 to the
ATP column could not be detected, though DNA polymerase activity was recovered
in the flow-through.

The HEP.P1-ATP-Bound fractions from the ATP agarose chromatographic step were
further purified by anion exchange over MonoQ. The HEP.P1-ATP-Bound fractions
were diluted with buffer A to approximately the conductivity of buffer A plus 25 mM
NaCl and applied to a 1 ml MonoQ column equilibrated in buffer A. DNA
polymerase activity eluted in the flow-through and in two resolved chromatographic
peaks (MONOQ peak1 and peak2) (Fig. 13C). Peak 2 was by far the major source of
DNA polymerase activity. Western analysis using rabbit antibody directed against the
E. coli α subunit confirmed presence of the α subunit in the second peak (see the
Western analysis in Fig. 14B). Antibody against the E. coli γ subunit also confirmed
the presence of the γ subunit in the second peak (not shown). Some reaction against α
and γ was also present in the minor peak (first peak). The Coomassie Blue SDS
polyacrylamide gel of the MonoQ fractions (Fig. 14A) showed a band that
co-migrated with E. coli α and was in the same position as the antibody-reactive
material (antibody against E. coli α). Also present are bands corresponding to τ, γ, δ
and δ'. These subunits, along with β, are all that is necessary for rapid and processive
synthesis and primer extension over a long (> 7 kb) stretch of ssDNA in the case of E.
coli DNA Polymerase III holoenzyme.

The Pol III-type enzyme purified from T.th. may be a Pol III*-like enzyme that
contains the DNA polymerase and clamp loader subunits (i.e. like the Pol III* of E.
coli). The evidence for this is: 1) the presence of dnaX and dnaE gene products in the
same column fractions as indicated by Western analysis (see above); 2) the ability of
this enzyme to extend a primer around a 7.2 kb circular ssDNA upon adding only β
(see Example 14); 3) stimulation of Pol III by adding β on linear DNA, indicating the
β subunit is not present in saturating amounts (see Example 13); 4) the presence of
τ in T.th., which may glue the polymerase and clamp loader into a Pol III* as in E.
coli; and 5) the comigration of α with subunits γ, δ and δ' of the clamp loader in the
column fractions of the last chromatographic step (MonoQ, Fig. 14A).

Micro-sequencing of T.th. DNA Polymerase III α subunit

The α subunit from the purified T.th. DNA polymerase III
(HEP.P1.ATP-Bound.MONOQ peak2) was blotted onto PVDF membrane and was
cut out of the SDS-PAGE gel and submitted to the Protein-Nucleic Acid Facility at
Rockefeller University for N-terminal sequencing and proteolytic digestion,
purification and microsequencing of the resultant peptides. Analysis of the α
candidate band (Mw = 130 kD) yielded four peptides, two of which (TTH1, TTH2)
showed sequence similarity to α subunits from various bacterial sources (see Fig. 15).

EXAMPLE 10 

Identification of the Thermus thermophilus dnaE gene encoding the α subunit of DNA
polymerase III holoenzyme

Cloning of the dnaE gene was started with the sequence of the TTH1 peptide from the
purified α subunit (FFIEIQNHGLSEQK). The fragment was aligned to a region at
approximately 180 amino acids downstream of the N-termini of several other known
α subunits as shown in Fig. 15. The upstream 33mer

(5'-GTGGGATCCGTGGTTCTGGATCTCGATGAAGAA-3') consists of a BamHI
site within the first 9 nucleotides (underlined) and the sequence coding for the
peptide FFIEIQNH on the complementary strand. The downstream 29mer
(5'-GTGGGATCCACGGSCTSTCSGAGCAGAAG-3') consists of a BamHI site
within the first 9 nucleotides (underlined) and the following sequence coding for the
peptide HGLSEQK.
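
The position of the BamHI recognition sequence in each primer can be confirmed with a short script. This is a minimal sketch: the function name is an assumption, the recognition sequence GGATCC is the standard BamHI site, and the primer strings are those quoted above with OCR spacing removed.

```python
BAMHI_SITE = "GGATCC"


def site_span(primer, site=BAMHI_SITE):
    """Return the 1-based (start, end) of the first `site` in `primer`, or None."""
    idx = primer.upper().find(site)
    if idx < 0:
        return None
    return idx + 1, idx + len(site)


upstream_33mer = "GTGGGATCCGTGGTTCTGGATCTCGATGAAGAA"
downstream_29mer = "GTGGGATCCACGGSCTSTCSGAGCAGAAG"

for name, primer in [("upstream", upstream_33mer), ("downstream", downstream_29mer)]:
    print(name, len(primer), site_span(primer))
```

In both primers the site occupies positions 4 to 9, consistent with the statement that it lies within the first 9 nucleotides.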

These two primers were directed away from each other for the purpose of performing
inverse PCR (also called circular PCR). The amplification reactions contained 10 ng
T.th. genomic DNA (that had been cut with XmaI and religated), 0.5 mM of each
primer, in a volume of 100 µl of Vent polymerase reaction mixture containing 10 µl
ThermoPol Buffer, 0.5 mM of each dNTP and 0.25 mM MgSO4. Amplification was
performed using the following cycling scheme:

1. 4 cycles of: 95.5°C - 30", 45°C - 30", 75°C - 8'
2. 6 cycles of: 95.5°C - 30", 50°C - 30", 75°C - 6'
3. 30 cycles of: 95.5°C - 30", 52.5°C - 30", 75°C - 5'
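
A three-stage scheme of this kind can be written down as data and its total hold time computed. The sketch below is illustrative: the data layout and function name are assumptions, 30" is entered as 0.5 min, and ramp times between temperatures are ignored.

```python
# Each stage: (number of cycles, [denature, anneal, extend] hold times in minutes).
# Values follow the three-stage scheme above.
STAGES = [
    (4, [0.5, 0.5, 8.0]),
    (6, [0.5, 0.5, 6.0]),
    (30, [0.5, 0.5, 5.0]),
]


def total_hold_minutes(stages):
    """Sum of all programmed hold times, excluding temperature ramps."""
    return sum(cycles * sum(holds) for cycles, holds in stages)


print(total_hold_minutes(STAGES))  # 258.0
```

The long extension times in the early cycles dominate the per-cycle cost, but the 30 later cycles still account for most of the run.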



A 1.4 kb fragment was obtained and cloned into pBS-SK:BamHI (i.e. pBS-SK
(Stratagene) was cut with BamHI). This sequence was bracketed by the 29mer
primer on both sides and contained the sequence coding for the N-terminal part of the
α subunit up to the peptide used for primer design.

To obtain further dnaE gene sequence, the TTH2 peptide was used. It was aligned to
a region about 600 amino acids from the N-termini of the other known α subunits
(Fig. 15B).

The upstream 34mer (5'-GCGGGATCCTCAACGAGGAGCTCTCCATCTTCAA-3')
consists of a BamHI site within the first 9 nucleotides (underlined) and the sequence
from the end of the fragment previously obtained. The downstream 33mer
(5'-GCGGGATCCTTGTCGTCSAGSGTSAGSGCGTCGTA-3') consists of a BamHI
site within the first 9 nucleotides (underlined) and the following sequence coding for
the peptide YDALTLDD on the complementary strand. The amplification reactions
contained 10 ng T.th. genomic DNA, 0.5 mM of each primer, in a volume of 100 µl of
Vent polymerase reaction mixture containing 10 µl ThermoPol Buffer, 0.5 mM of
each dNTP and 0.25 mM MgSO4. Amplification was performed using the following
cycling scheme:

1. 4 cycles of: 95.5°C - 30", 45°C - 30", 75°C - 8'
2. 6 cycles of: 95.5°C - 30", 50°C - 30", 75°C - 6'
3. 30 cycles of: 95.5°C - 30", 55°C - 30", 75°C - 5'

A 1.2 kb PCR fragment was obtained and cloned into pUC19:BamHI. The fragment
was bracketed by the downstream primer on both sides and contained a region
overlapping by 56 bp with the fragment previously cloned.

To obtain yet more dnaE sequence, the following primers were used. The upstream
39mer (3'-GTGTGGATCCTCGTCCCCCTCATGCGCGACCAGGAAGGG-5')
consists of a BamHI site within the first 10 nucleotides (underlined) and the sequence
from the end of the fragment previously obtained. The downstream 27mer
(5'-GTGTGGATCCTTCTTCTTSCCCATSGC-3') consists of a BamHI site within the
first 10 nucleotides (underlined), and the sequence coding for the peptide AMGKKK
(at position approximately 800 residues from the N terminus) on the complementary
strand. The AMGKKK sequence was chosen for primer design as it is highly
conserved among the known gram-negative α subunits. The amplification reactions
contained 10 ng T.th. genomic DNA, 0.5 mM of each primer, in a volume of 100 µl of
Taq polymerase reaction mixture containing 10 µl PCR Buffer, 0.5 mM of each dNTP
and 2.5 mM MgCl2. Amplification was performed using the following cycling
scheme:

1. 3 cycles of: 95.5°C - 30", 45°C - 30", 72°C - 8'
2. 6 cycles of: 94.5°C - 30", 55°C - 30", 72°C - 6'
3. 32 cycles of: 94.5°C - 30", 50°C - 30", 72°C - 5'

A 2.3 kb PCR fragment was obtained instead of the expected 0.6 kb fragment. BamHI
digestion of the PCR product resulted in three fragments of 1.1 kb, 0.7 kb and 0.5 kb.
The 1.1 kb fragment was cloned into pUC19:BamHI. It turned out to be the one
adjacent to the fragment previously obtained and contained the dnaE sequence right
up to the region coding for the AMGKKK peptide, but was disrupted by an intein just
upstream of this region.

The sequence that follows this was amplified from the 2.3 kb original PCR product
using the same conditions and cycling scheme as for the 2.3 kb fragment. The
downstream primer was the same as in the previous step. The upstream 27mer
(3'-GTGTGGATCCGTGGTGACCTTAGCCAC-5') consisted of a BamHI site
within the first 9 nucleotides (underlined) and the sequence from the end of the 1.1 kb
fragment previously described.

The expected 1.2 kb PCR fragment was obtained and cloned into pUC19:SmaI. This
fragment coded for the rest of the intein and the end of it was used to obtain the next
sequence of dnaE downstream of this region. The upstream 30mer
(5'-TTCGTGTCCGAGGACCTTGTGGTCCACAAC-3') was a sequence from the
end of the intein. The downstream 23mer
(5'-CCAGAATCGTCTGCTGGTCGTAG-3') was the sequence from the end of the
dnaE gene of D.rad. (coding on the complementary strand for the region slightly
homologous in the distantly related α subunits and possibly highly homologous
between T.th. and D.rad. α subunits). The amplification reactions contained 10 ng
T.th. genomic DNA, 0.5 mM of each primer, in a volume of 100 µl of Vent
polymerase reaction mixture containing 10 µl ThermoPol Buffer, 0.5 mM of each
dNTP and 0.1 mM MgSO4. Amplification was performed using the following
cycling scheme:

1. 3 cycles of: 95.5 °C - 30", 55 °C - 30", 75 °C - 8' 

2. 32 cycles of: 94.5°C - 30", 50°C - 30", 75°C - 5' 

A 2.5 kb PCR fragment was obtained and cloned into pUC19:SmaI. This fragment
contained the dnaE sequence coding for the 300 amino acids next to the AMGKKK
region, disrupted by yet a second intein inside another sequence that is conserved
among the known α subunits (FNKSHSAAY).

To obtain the rest of the dnaE gene the upstream 19mer
(5'-AGCACCCTGGAGGAGCTTC-3') from the end of the known dnaE sequence
was used. The downstream primer was: 5'-CATGTCGTACTGGGTGTAC-3'. The
amplification reactions contained 10 ng T.th. genomic DNA, 0.5 mM of each primer,
in a volume of 100 µl of Vent polymerase reaction mixture containing 10 µl
ThermoPol Buffer, 0.5 mM of each dNTP and 0.1 mM MgSO4. Amplification was
performed using the following cycling scheme:

1. 3 cycles of: 95.5°C - 30", 55°C - 30", 75°C - 8'
2. 32 cycles of: 94.5°C - 30", 50°C - 30", 75°C - 5'

A 1.0 kb fragment bracketed by this upstream primer was obtained. It contained the 3'
end of the dnaE gene.

EXAMPLE 11

Cloning and Expression of the Thermus thermophilus dnaQ gene encoding the ε
subunit of DNA polymerase III holoenzyme

Cloning of dnaQ - The DnaQ gene of E. coli and the corresponding region of PolC of
B. subtilis, evolutionarily divergent organisms, share approximately 30% identity.
Comparison of the predicted amino acid sequences encoded by DnaQ of E. coli and
PolC of B. subtilis revealed two highly conserved regions (Fig. 17). Within
each of these regions, a nine amino acid sequence was used to design two
oligonucleotide primers for use in the polymerase chain reaction.

The regions highly conserved among Pol III exonucleases were chosen to design
the degenerate primers for the amplification of a T.th. dnaQ internal fragment (see
Fig. 17). DNA oligonucleotides for amplification of T.th. genomic DNA were as
follows. The upstream 27mer (5'-GTSGTSNNSGACNNSGAGACSACSGGG-3')
encodes the following sequence (VVXDXETTG). The downstream 27mer
(5'-GAASCCSNNGTCGAASNNGGCGTTGTG-3') encodes the sequence
HNAXFDXGF on the complementary strand. The amplification reactions contained
10 ng T.th. genomic DNA, 0.5 mM of each primer, in a volume of 100 µl of Vent
polymerase reaction mixture containing 10 µl ThermoPol Buffer, 0.5 mM of each
dNTP and 0.5 mM MgSO4. Amplification was performed using the following cycling
scheme:

1. 5 cycles of: 95.5°C - 30", 40°C - 30", 72°C - 2'
2. 5 cycles of: 95.5°C - 30", 45°C - 30", 72°C - 2'
3. 30 cycles of: 95.5°C - 30", 50°C - 30", 72°C - 30"

Products were visualized in a 1.5% native agarose gel. A fragment of the expected
size of 270 bp was cloned into the SmaI site of pUC19 and sequenced with the
CircumVent Thermal Cycle DNA sequencing kit according to the manufacturer's
instructions (New England Biolabs).

To obtain further sequence of the dnaQ gene, genomic DNA was digested with either
XhoI, BamHI, KpnI or NcoI. These restriction enzymes were chosen because they cut
T.th. genomic DNA frequently. 0.1 µg of DNA for each digest was ligated by T4
DNA ligase in 50 µl of ligation buffer (50 mM Tris-HCl (pH 7.8), 10 mM MgCl2, 10
mM dithiothreitol, 1 mM ATP, 25 mg/ml bovine serum albumin) overnight at 20°C.
The ligation mixtures were used for circular PCR.

DNA oligonucleotides for amplification of T.th. genomic DNA were the following.
The upstream 27mer (5'-CGGGGATCCACCTCAATCACCTCGTGG-3') consists of
a BamHI site within the first 9 nucleotides (underlined) and the sequence
complementary to the 42-61 bp region of the previously cloned dnaQ fragment. The
downstream 30mer (5'-CGGGGATCCGCCACCTTGCGGCTCCGGGTG-3') consists
of a BamHI site within the first 9 nucleotides (underlined) and the sequence
corresponding to the 240-261 bp region of the dnaQ fragment (see Fig. 17).

The amplification reactions contained 1 ng T.th. genomic DNA (that had been cut
with NcoI and religated into circular DNA for circular PCR), 0.4 mM of each primer,
in a volume of 100 µl of Vent polymerase reaction mixture containing 10 µl
ThermoPol Buffer, 0.5 mM of each dNTP, 0.5 mM MgSO4, and 10% DMSO.
Circular amplification was performed using the following cycling scheme:

1. 5 cycles of: 95.5°C - 30", 50°C - 30", 72°C - 8'
2. 35 cycles of: 95.5°C - 30", 55°C - 30", 72°C - 6'
3. 72°C - 10'



A 1.5 kb fragment was obtained and cloned into the BamHI site of the pUC19 vector.
Partial sequencing of the fragment revealed that it contained the dnaQ regions adjacent
to sequences corresponding to the PCR primers and hence contained the sequences
both upstream and downstream of the previously cloned dnaQ fragment. One of the
NcoI sites turned out to be approximately 300 bp downstream of the end of the first
cloned dnaQ sequence and hence the fragment did not include the 3' end of dnaQ. To
obtain the 3' end, another inverse PCR reaction was performed. Since an ApaI
restriction site was recognized within this newly sequenced dnaQ fragment, the
circular PCR procedure was performed using as template an ApaI digest of T.th.
genomic DNA that was ligated (circularized) under the same conditions as described
above.



DNA oligonucleotides for amplification of the ApaI/religated T.th. genomic DNA
were as follows. The upstream 31mer
(5'-GCGCTCTAGACGAGTTCCCAAAGCGTGCGGT-3') consists of an XbaI site
within the first 10 nucleotides (underlined) and the sequence complementary to the
region downstream of the ApaI restriction site in the newly sequenced dnaQ fragment.
The downstream 31mer (5'-CGCGTCTAGATCACCTGTATCCAGA-3') consists of an
XbaI site within the first 10 nucleotides (underlined) and the sequence corresponding
to another region downstream of the ApaI restriction site in the newly sequenced
dnaQ fragment. The 1.7 kb PCR fragment was cloned into the XbaI site of the
pUC19 vector and partially sequenced. The sequence of dnaQ, and the protein
sequence of the ε subunit encoded by it, is shown in Fig. 18.

The dnaQ open reading frame encodes a protein of 209 (or 190, depending on
which Val is used as the initiating residue) amino acids in length (23598.5 Da, or
21383.8 Da for the shorter version), similar to the length of the E. coli ε subunit (243
amino acids, 27099.1 Da mass) (see Fig. 17).
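
As a rough consistency check on the sizes quoted above, the mass per residue can be computed for each protein. This sketch reads the quoted masses as daltons (an assumption, but the only reading consistent with proteins of roughly 200 residues); the dictionary labels and function name are illustrative.

```python
# (length in residues, quoted mass in Da) taken from the paragraph above.
PROTEINS = {
    "T.th. epsilon (longer form)": (209, 23598.5),
    "T.th. epsilon (shorter form)": (190, 21383.8),
    "E. coli epsilon": (243, 27099.1),
}


def mass_per_residue(length, mass_da):
    """Average mass per amino acid residue in daltons."""
    return mass_da / length


for name, (length, mass) in PROTEINS.items():
    print(f"{name}: {mass_per_residue(length, mass):.1f} Da per residue")
```

All three values fall between 111 and 113 Da per residue, close to the commonly used approximation of about 110 Da per residue for an average protein.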

The entire amino acid sequence of the ε subunit predicted from the T.th. dnaQ gene
aligns with the predicted amino acid sequence of the dnaQ genes of other organisms
with only a few gaps and insertions (the first two amino acids, and four positions
downstream) (Fig. 17). The consensus motifs (VVXDXETTG, HNAXFDXGF, and
HRALYD), characteristic for exonucleases, are conserved. Overall, the level of
amino acid identity relative to most of the known ε subunits, or corresponding
proofreading exonuclease domains of gram positive PolC genes, is approximately
30%. Upstream of start 1 (Fig. 17) there were stop codons in all three reading frames.
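
The two degenerate 27mers used earlier in this example to amplify the internal dnaQ fragment can be checked against the first two consensus motifs by translating them codon by codon. The sketch below is a minimal illustration under stated assumptions: it uses the standard genetic code and IUPAC ambiguity codes (S = C or G, N = any base), and the function names are hypothetical. A codon whose possible expansions encode more than one amino acid is reported as X.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(dna):
    """Reverse complement that also handles IUPAC ambiguity codes."""
    return dna.translate(COMPLEMENT)[::-1]


def translate_degenerate(dna):
    """Translate complete codons; ambiguous codons with mixed meaning become 'X'."""
    peptide = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        expansions = product(*(IUPAC[base] for base in dna[i:i + 3]))
        residues = {CODON_TABLE["".join(codon)] for codon in expansions}
        peptide.append(residues.pop() if len(residues) == 1 else "X")
    return "".join(peptide)


upstream = "GTSGTSNNSGACNNSGAGACSACSGGG"
downstream = "GAASCCSNNGTCGAASNNGGCGTTGTG"

print(translate_degenerate(upstream))                        # VVXDXETTG
print(translate_degenerate(reverse_complement(downstream)))  # HNAXFDXGF
```

Restricting the wobble position to S (C or G) keeps the degeneracy low while matching the GC-rich codon usage expected for a thermophile, which is presumably why the primers are written this way.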

Expression of DnaQ - The dnaQ gene was cloned into the pET24-a expression
vector in two steps. First, the PCR fragment encoding the N-terminal part of the gene
was cloned into the NdeI/ApaI sites of the pUC19 plasmid containing the ApaI
inverse PCR fragment. DNA oligonucleotides for amplification of T.th. genomic DNA
were as follows. The upstream 33mer
(5'-GCGGCGCATATGGTGGTGGTCCTGGACCTGGAG-3') consists of an NdeI
site within the first 12 nucleotides (underlined) and the beginning of the dnaQ gene.
The downstream 31mer (5'-CGCGTCTAGATCACCTGTATCCAGA-3'), already
used for ApaI circular PCR, consists of an XbaI site within the first 10 nucleotides
(underlined) and the sequence corresponding to the region downstream of the ApaI
restriction site. The 2.2 kb NdeI/SalI fragment was then cloned into the NdeI/XhoI
sites of the pET16 vector to produce pET24-a:dnaQ. The ε subunit was expressed in
the BL21/LysS strain transformed by the pET24-a:dnaQ plasmid.

The Thermus thermophilus dnaN gene encoding the β subunit of DNA polymerase III
holoenzyme
Strategy of cloning DnaN by use of DnaA - DnaN proteins are highly divergent in
bacteria, making it difficult to clone them by homology. The level of identity between
DnaN representatives from E. coli and B. subtilis is as low as 18%. These 18% of
identical amino acid residues are dispersed through the proteins rather than clustering
together in conserved regions, further complicating use of homology to design PCR
primers. However, one feature of dnaN genes among widely different bacteria is their
location in the chromosome. They appear to be near the origin, and immediately
adjacent to the dnaA gene. DnaA genes show good homology among different
bacteria and thus we first cloned dnaA in order to obtain a DNA probe that is likely
near dnaN.

Identification of dnaA and dnaN - The DnaA genes of E. coli and B. subtilis share
58% identity at the amino acid sequence level within the ATP-binding domain (or
among the representatives of gram-positive and gram-negative bacteria, evolutionarily
divergent organisms). Comparison of the predicted amino acid sequences encoded by
dnaA of E. coli and B. subtilis revealed two highly conserved regions (Fig. 19).
Within each of these regions, a seven amino acid sequence was used to design two
oligonucleotide primers for use in the polymerase chain reaction. The DNA
oligonucleotides for amplification of T.th. genomic DNA were as follows. The
upstream 20mer (5'-GTSCTSGTSAAGACSCACTT-3') encodes the following
sequence: VLVKTHL. The downstream 21mer
(5'-SAGSAGSGCGTTGAASGTGTG-3') encodes the sequence: HTFNALL, on the
complementary strand. The amplification reactions contained 10 ng T.th. genomic
DNA, 0.5 mM of each primer, in a volume of 100 µl of Vent polymerase reaction
mixture containing 10 µl ThermoPol Buffer, 0.5 mM of each dNTP and 0.5 mM
MgSO4. Amplification was performed using the following cycling scheme:
1. 5 cycles of: 95.5°C - 30", 45°C - 30", 75°C - 2'
2. 5 cycles of: 95.5°C - 30", 50°C - 30", 75°C - 2'
3. 30 cycles of: 95.5°C - 30", 52°C - 30", 75°C - 30"

Products were visualized in a 1.5% native agarose gel. A fragment of the expected
size of 300 bp was cloned into the SmaI site of pUC19 and sequenced with the
CircumVent Thermal Cycle DNA sequencing kit (New England Biolabs).

To obtain a larger section of the T.th. dnaA gene, genomic DNA was digested with
either HaeII, HindIII, KasI, KpnI, MluI, NcoI, NgoMI, NheI, NsiI, PaeR7I, PstI, SacI,
SalI, SpeI, SphI, StuI, or XhoI, followed by Southern analysis in a native agarose gel.
The filter was probed with the 300 bp PCR product radiolabeled by random priming.
Four different restriction digests showed a single fragment of reasonable size for
further cloning. These were KasI, NgoMI, and StuI, which produced fragments of
about 3 kb, and NcoI, which produced a 2 kb fragment. Also, a KpnI digest resulted in
two fragments of about 1.5 kb and 10 kb.

Genomic DNA digests using either NgoMI or StuI were used to obtain the dnaA
gene by inverse PCR (also referred to as circular PCR). In this procedure, 0.1 µg of
DNA from each digest was treated separately with T4 DNA ligase in 50 µl of ligation
buffer (50 mM Tris-HCl (pH 7.8), 10 mM MgCl2, 10 mM dithiothreitol, 1 mM ATP,
25 mg/ml bovine serum albumin) overnight at 20°C. This results in circularizing the
genomic DNA fragments. The ligation mixtures were used as substrate in inverse
PCR.
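
The logic of inverse PCR on a circularized fragment can be sketched in a few lines. This is a toy model under stated assumptions, not the procedure's actual sequences: the fragment, flanks and primers are hypothetical, the function names are illustrative, and the model ignores everything except where the primers anneal. Two primers pointing away from each other within the known core yield a product that runs out of the core, across the ligation junction, and back into the core from the other side, so the previously unknown flanks are captured.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def inverse_pcr_product(fragment, outward_fwd, outward_rev):
    """Top strand of the product amplified from a self-ligated (circular) fragment.

    fragment    : top strand of the linear restriction fragment before ligation
    outward_fwd : primer matching the top strand, extending toward the 3' end
    outward_rev : primer matching the bottom strand, extending toward the 5' end
    """
    fwd_start = fragment.find(outward_fwd)
    rev_site = reverse_complement(outward_rev)
    rev_start = fragment.find(rev_site)
    if fwd_start < 0 or rev_start < 0:
        raise ValueError("primer does not anneal to this fragment")
    rev_end = rev_start + len(rev_site)
    if rev_end > fwd_start:
        raise ValueError("primers are not directed away from each other")
    # Extension leaves the known core, crosses the ligation junction,
    # and re-enters the known core from the opposite side.
    return fragment[fwd_start:] + fragment[:rev_end]


# Hypothetical toy fragment: unknown flanks around a known core (not T.th. sequence).
left_flank, known_core, right_flank = "TTGACA", "ATGGCGTACCGT", "GGCTAA"
fragment = left_flank + known_core + right_flank
product = inverse_pcr_product(
    fragment, known_core[-6:], reverse_complement(known_core[:6])
)
print(product)
```

In the printed product the right flank is joined directly to the left flank, which is the ligation junction created by circularization.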



DNA oligonucleotides for amplification of recircularized T.th. genomic DNA were as
follows. The upstream 22mer was 5'-CTCGTTGGTGAAAGTTTCCGTG-3', and the
downstream 24mer was 5'-CGTCCAGTTCATCGCCGGAAAGGA-3'. The
amplification reactions contained 5 ng T.th. genomic DNA, 0.5 µM of each primer, in
a volume of 100 µl of Taq polymerase reaction mixture containing 10 µl PCR Buffer,
0.5 mM of each dNTP and 2.5 mM MgCl2. Amplification was performed using the
following cycling scheme:

1. 5 cycles of: 95.0°C - 30", 55°C - 30", 72°C - 10' 

2. 35 cycles of: 95.5 °C - 30", 50°C - 30", 72°C - 8' 

The PCR fragments of the expected length for NgoMI and StuI treated and then
ligated chromosomal DNA were digested with either BamHI or Sau3a and cloned into
pUC19:BamHI and pUC19:(BamHI+SmaI) and sequenced with CircumVent
Thermal Cycle DNA sequencing kit. The 1.6 kb (BamHI+BamHI) fragment from the
NgoMI PCR product contained a sequence coding for the N-terminal part of DnaN,
followed by the gene for enolase. The 1 kb (Sau3a+Sau3a) fragment from the same
PCR product included the start of dnaN gene and sequence characteristic of the origin
of replication (i.e. 9mer DnaA-binding site sequences). The 0.6 kb (BamHI+BamHI)
fragment from the StuI PCR reaction contained starts for dnaA and gidA genes in
inverse orientation to each other. The 0.4 kb (Sau3a+Sau3a) fragment from the same
PCR product contained the 3' end of the dnaA gene and DNA sequence characteristic
for the origin of replication.

This sequence information provided the beginning and end of both the dnaA and the 
dnaN genes. Hence, these genes were easily cloned from this information. Further, 
the DnaN gene was readily cloned and expressed in a pET24-a vector. These steps are
described below.

Cloning and sequence of the dnaA gene - The dnaA gene was cloned for sequencing
in two parts: from the potential start of the gene up to its middle and from the middle
up to the end. For the N-terminal part the upstream 27mer
(5'-TCTGGCAACACGTTCTGGAGCACATCC-3') was 20 bp downstream of the
potential start codon of the gene. The downstream 23mer
(5'-TGCTGGCGTTCATCTTCAGGATG-3') was approximately from the middle of
the dnaA gene. For the C-terminal part the upstream 23mer
(5'-CATCCTGAAGATGAACGCCAGCA-3') was complementary to the previous
primer. The downstream 25mer (5'-AGGTTATCCACAGGGGTCATGTGCA-3')
was 20 bp upstream of the potential stop codon for the dnaA gene. The amplification
reactions contained 10 ng T.th. genomic DNA, 0.5 µM of each primer, in a volume of
100 µl of Vent polymerase reaction mixture containing 10 µl ThermoPol Buffer, 0.5
mM of each dNTP and 0.5 mM MgSO4. Amplification was performed using the
following cycling scheme:

1. 5 cycles of: 95.5°C - 30", 55°C - 30", 75°C - 3'
2. 30 cycles of: 95.5°C - 30", 50°C - 30", 75°C - 2'

Products were visualized in a 1.0% native agarose gel. Fragments of the expected
sizes of 750 bp and 650 bp were produced, and were sequenced using the CircumVent
Thermal Cycle DNA sequencing method (New England Biolabs). The nucleotide and
amino acid sequences of dnaA and its protein product are shown in Fig. 20. The
DnaA protein is homologous to the DnaA proteins of several other bacteria as shown
in Fig. 19.

Cloning and expression of dnaN - The full length dnaN gene was obtained by PCR
from T.th. total DNA. DNA oligonucleotides for amplification of T.th. dnaN were the
following: the upstream 29mer
(5'-GTGTGTCATATGAACATAACGGTTCCCAA-3') consists of an NdeI site
within the first 11 nucleotides (underlined), followed by the sequence for the start of
the dnaN gene; the downstream 29mer
(5'-GCGCGAATTCTCCCTTGTGGAAGGCTTAG-3') consists of an EcoRI site
within the first 10 nucleotides (underlined), followed by the sequence complementary
to a section just downstream of the dnaN stop codon. The amplification reactions
contained 10 ng T.th. genomic DNA, 0.5 µM of each primer, in a volume of 100 µl of
Vent polymerase reaction mixture containing 10 µl ThermoPol Buffer, 0.5 mM of
each dNTP and 0.2 mM MgSO4. Amplification was performed using the following
cycling scheme:

1. 5 cycles of: 95.0°C - 30", 55°C - 30", 75°C - 5'
2. 35 cycles of: 95.5°C - 30", 50°C - 30", 75°C - 4'



The nucleotide and amino acid sequences of dnaN and the β subunit, respectively, are
shown in Fig. 21. The T.th. β subunit shows limited homology to the β subunit
sequences of several other bacteria over its entire length (Fig. 22).



The approximately 1 kb dnaN gene was cloned into the pET24-a expression vector
using the NdeI and EcoRI restriction sites both in the dnaN-containing PCR product
and in pET24-a (Fig. 23). Expression of T.th. β subunit was obtained under the
following conditions: a fresh colony of BL21(DE3) E. coli strain was transformed by
the pET24-a:dnaN plasmid, and then was grown in LB broth containing 50 mg/ml
kanamycin at 37°C until the cell density reached 0.4 OD600. The cell culture was then
induced for dnaN expression upon addition of 2 mM IPTG. Cells were harvested
after 4 additional hours of growth at 37°C. The induction of the T.th. β subunit is
shown in Fig. 24.

Two liters of BL21(DE3)pETdnaN cells were grown in LB media containing 50
mg/ml ampicillin at 37°C to an O.D. of 0.8, and then IPTG was added to a
concentration of 2 mM. After a further 2 h at 37°C, cells were harvested by
centrifugation and stored at -70°C. The following steps were performed at 4°C.
Cells were thawed and resuspended in 40 ml of 5 mM Tris-HCl (pH 8.0), 1% sucrose,
1 M NaCl, 5 mM DTT, and 30 mM spermidine. Cells were lysed using a French
pressure cell at 20,000 psi. The lysate was allowed to sit at 4°C for 30 min, and then
cell debris was removed by centrifugation (Sorvall SS-34 rotor, 45 min, 18,000 rpm).
The supernatant was incubated at 65°C for 20 minutes with occasional stirring. The
resulting protein precipitate was removed by centrifugation as described above. The
supernatant was dialyzed overnight against 4 liters of buffer A containing 50 mM
NaCl. The dialyzed supernatant was clarified by centrifugation (35 ml, 150 mg
total) and then loaded onto an 8 ml MonoQ column equilibrated in buffer A
containing 50 mM NaCl. The column was washed with 5 column volumes of the
same buffer and then eluted with a 120 ml gradient of buffer A plus 50 mM NaCl to
buffer A plus 500 mM NaCl. Fractions of 2 ml were collected. Over 50 mg of T.th. β
was recovered in fractions 5-21.
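
As an illustration of the elution step above, the following Python sketch estimates the NaCl concentration spanned by a given fraction of the linear gradient. It is only a rough aid under stated assumptions: the function names are hypothetical, fraction 1 is assumed to begin exactly where the gradient begins, and column dead volume is ignored, none of which is stated in the procedure.

```python
def nacl_at_volume(volume_ml, start_mm=50.0, end_mm=500.0, gradient_ml=120.0):
    """NaCl concentration (mM) once volume_ml of a linear gradient has been delivered."""
    if not 0.0 <= volume_ml <= gradient_ml:
        raise ValueError("volume lies outside the gradient")
    return start_mm + (end_mm - start_mm) * volume_ml / gradient_ml


def fraction_window(fraction, fraction_ml=2.0):
    """Return the (start, end) NaCl concentration in mM for a 1-based fraction number.

    Assumption (not stated in the text): fraction 1 begins exactly where the
    gradient begins, and column/tubing dead volume is ignored.
    """
    start_ml = (fraction - 1) * fraction_ml
    return nacl_at_volume(start_ml), nacl_at_volume(start_ml + fraction_ml)


if __name__ == "__main__":
    low, _ = fraction_window(5)
    _, high = fraction_window(21)
    print(f"fractions 5-21: about {low:.1f} to {high:.1f} mM NaCl")
```

Under these assumptions the 120 ml gradient corresponds to 60 fractions of 2 ml, so fractions 5-21 fall in the early, low-salt part of the gradient.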

EXAMPLE 13

Alternate synthetic path in absence of clamp loader activity: 

As discussed earlier, the Pol III-type enzyme of the present invention is capable of
application and use in a variety of contexts, including a method wherein the clamp
loader component that is traditionally involved in the initiation of enzyme activity is
not required. The clamp loader generally functions to increase the efficiency of ring
assembly onto circular primed DNA because both the ring and the DNA are circles
and one must be broken transiently for them to become interlocked rings. In such a
reaction, the clamp loader increases the efficiency of opening the ring.

The procedure described below illustrates the instance where the clamp loader need
not be present. For example, the β clamp can be assembled onto DNA in the absence
of the clamp loader. Particularly, the bulk of primed templates in PCR reactions are
linear ssDNA fragments that are primed at the ends. On linear primed DNA, the ring
need not open at all. Instead, the ring can simply thread onto the end of the linear
primed template (Bauer and Burgers, 1988; Tan et al., 1986; O'Day et al., 1992;
Burgers and Yoder, 1993). Hence, on linear primed templates, such as those
generated in PCR, the beta clamp can simply slide over the DNA end. After the ring
slides onto the end, the DNA polymerase can associate with the ring for enhanced
DNA synthesis.

Such "end assembly" is common among Pol III-type enzymes and has been
demonstrated in the yeast and human systems. Rings assembling onto linear DNA for
use by their respective DNA polymerases are shown in the following example
demonstrated in the E. coli bacterial system, in the human system, and in the T.th.
system.



The bulk of the primed templates in PCR reactions are linear ssDNA fragments that
are primed at their ends. However, these end primed linear fragments are not
generated until after the first step of PCR has already been performed. In the very
first step, PCR primers generally anneal at internal sites in a heat denatured ssDNA
template. Primed linear templates are then generated in subsequent steps, enabling use
of this alternate path. For this first step, the clamp may be assembled onto an internal
site using special conditions that allow clamp assembly in the absence of a clamp
loader.



For example, a set of conditions that lead to assembly of the clamp onto circular DNA
(i.e. internal primed sites) have been described in the protocol for the use of the
bacteriophage T4 ring shaped clamp (gene 45 protein) without the clamp loader
(Reddy et al., 1993). In this case, polyethylene glycol leads to "macromolecular
crowding" such that the clamp and DNA are pushed together in close proximity,
leading to the ring self assembling onto internal primed sites on circular DNA. Other
possible conditions that may lead to assembly of rings onto internal sites include use
of a high concentration of beta, or use of heat or denaturant to break the dimeric
ring into two half rings (crescents) followed by lowering the heat (or dilution or
removal of denaturant), leading to rings assembling around the DNA.



The ring shaped sliding clamps of E. coli and human slide over the end of linear DNA
to activate their respective DNA polymerase in the absence of the clamp loader. This
clamp loader independent assay is performed in the bacterial system in Fig. 25A. For
this assay, the linear template is polydA primed with oligodT. The polydA is of
average length 4500 nucleotides and was purchased from SuperTecs. OligodT35 was
synthesized by Oligos etc. The template was prepared using 145 µl of 5.2 mM (as
nucleotide) polydA and 22 µl of 1.75 mM (as nucleotide) oligodT. The mixture was
incubated in a final volume of 2100 µl T.E. buffer (ratio as nucleotide was 21:1
polydA to oligodT). The mixture was heated to boiling in a 1 ml Eppendorf tube,
then removed and allowed to cool to room temperature. Assays were performed in a
final volume of 25 µl 20 mM Tris-Cl (pH 7.5), 8 mM MgCl2, 5 mM DTT, 0.5 mM
EDTA, 40 mg/ml BSA, 4% glycerol containing 20 nM [α-32P]dTTP, 0.1 µg
polydA-oligodT, 25 ng Pol III and, where present, 5 µg of β subunit. Proteins were
added to the reaction on ice, then shifted to 37°C for 5 min. DNA synthesis was
quantitated using DE81 paper as described (Rowen and Kornberg, 1979).
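
The template stoichiometry above can be checked with a short calculation. The Python sketch below is offered only as a convenience, using the volumes, concentrations, and lengths as printed; the function and variable names are hypothetical, and the printed figures may themselves carry transcription errors, so the computed ratio should be compared with, not substituted for, the ratio quoted in the text.

```python
def nmol_nucleotide(volume_ul, conc_mm):
    """Amount of nucleotide in nmol: microliters x mM (mmol/l) = nmol."""
    return volume_ul * conc_mm


# Values as printed in the procedure.
poly_da_nmol = nmol_nucleotide(145, 5.2)    # polydA, as nucleotide
oligo_dt_nmol = nmol_nucleotide(22, 1.75)   # oligodT, as nucleotide

# Ratio of polydA to oligodT expressed as nucleotide.
nucleotide_ratio = poly_da_nmol / oligo_dt_nmol

# Molecules: polydA averages 4500 nt per chain, oligodT35 is 35 nt per primer.
template_nmol = poly_da_nmol / 4500
primer_nmol = oligo_dt_nmol / 35
primers_per_template = primer_nmol / template_nmol

if __name__ == "__main__":
    print(f"polydA:oligodT (as nucleotide) = {nucleotide_ratio:.1f}:1")
    print(f"primers per polydA chain       = {primers_per_template:.1f}")
```

Expressing the mixture as primers per template chain, rather than only as a nucleotide ratio, indicates how many primed sites are available on an average polydA molecule.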

In the linear template assay, no ATP or dATP is provided and therefore, a clamp
loader, even if present, is not active. Thus, the clamp (e.g. β) can only stimulate the
DNA polymerase provided the clamp threads onto the DNA (see diagram in Fig. 25).
Hence, threading of the clamp is shown by a stimulation of the DNA polymerase. In
lane 1 of Fig. 25A, the DNA polymerase is incubated with the linear DNA in the
absence of the clamp, and lane 2 shows the result of adding the clamp. The results
show that the clamp is able to thread onto the DNA ends and stimulate the DNA
polymerase in the absence of ATP and thus, in the absence of clamp loading as well.

This clamp loader independent assay is performed in the human system in Fig. 25B.
The assay reaction (25 µl) contains 50 mM Tris-HCl (pH 7.8), 8 mM MgCl2, 1 mM
DTT, 1 mM creatine phosphate, 40 µg/ml bovine serum albumin, 0.55 µg human
SSB, 100 ng PCNA (where present), 7 units DNA polymerase delta (1 unit
incorporates 1 pmol dTMP in 60 min.), 40 mM [α-32P]dTTP and 0.1 ng
polydA-oligodT. Proteins were added to the reaction on ice, then shifted to 37°C for
60 min. DNA synthesis was quantitated using DE81 paper as described (Rowen and
Kornberg, 1979). In lane 3 (Fig. 25) the DNA polymerase δ is incubated with the
linear DNA in the absence of the clamp, and lane 4 shows the result of adding the
PCNA clamp. The results demonstrate that the clamp is able to thread onto the DNA
ends and stimulate the DNA polymerase in the absence of ATP and thus, the absence
of clamp loading.

This clamp loader independent assay is performed in the T.th. system in Fig. 25C.
The assay reaction is exactly as described above for use of the E. coli Pol III and beta
system except the temperature is 60°C and here the Pol III is HEP.P1 T.th. Pol III (0.5
µl, providing 0.1 units where one unit is equal to 1 pmol of dTTP incorporated in 1
minute under these conditions and in the absence of beta), and the beta subunit is 7 µg
T.th. β (from the MonoQ column). Proteins were added to the reaction on ice, then
shifted to 37°C for 60 min. DNA synthesis was quantitated using DE81 paper as
described (Rowen and Kornberg, 1979). In lane 3 (Fig. 25C), the T.th. Pol III is
incubated with the linear DNA in the absence of the clamp, and lane 4 shows the
result of adding the T.th. β clamp. The results demonstrate that the clamp is able to
thread onto the DNA ends and stimulate the DNA polymerase in the absence of clamp
loader activity.

EXAMPLE 14

Use of T.th. Pol III in long chain primer extension

A characteristic of Pol III-type enzymes is their ability to extend a single primer for
several kilobases around a long (e.g. 7 kb) circular single stranded DNA genome of a
bacteriophage. This reaction uses the circular β clamp protein. For the circular β to
be assembled onto a circular DNA genome, the circular β must be opened, positioned
around the DNA, then closed. This assembly of the circular beta around DNA
requires the action of the clamp loader, which uses ATP to open and close the ring
around DNA. In this example we use as a template the 7.2 kb circular single strand
DNA genome of bacteriophage M13mp18. This template was primed with a single
DNA 57mer oligonucleotide and the Pol III enzyme was tested for conversion of this
template to a double strand circular form (RFII). The reaction was supplemented with
recombinant T.th. β produced in E. coli. This assay is summarized in the scheme at
the top of Fig. 26. M13mp18 ssDNA was phenol extracted from phage purified as
described (Turner and O'Donnell, 1995). M13mp18 ssDNA was primed with a 57mer
DNA oligomer synthesized by Oligos etc. The replication assays contained 73 ng
singly primed M13mp18 ssDNA and 100 ng T.th. β subunit in a 25 µl reaction
containing 20 mM Tris-HCl (pH 7.5), 8 mM MgCl2, 40 µg/ml BSA, 0.1 mM EDTA,
4% glycerol, 0.5 mM ATP, 60 µM each of dCTP, dGTP, dATP and 20 α-32P-TTP
(specific activity 2,000-4,000 cpm/pmol). Either T.th. Pol III from the Heparin peak
1 (HEP.P1; 5 µl, 0.21 units where 1 unit equals 1 pmol nucleotide incorporated in 1
min.) or a non-Pol III from the Heparin peak 2 (HEP.P2; 5 µl, 2.6 units) was added to
the reaction. Reactions were shifted to 60°C for 5 min., and then DNA synthesis was
quenched upon adding 25 µl of 1% SDS, 40 mM EDTA. One half of the reaction was
analyzed in a 0.8% native agarose gel, and the other half was quantitated using DE81
paper as described (Studwell and O'Donnell, 1990).
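
To make the unit definition above concrete (1 unit equals 1 pmol nucleotide incorporated in 1 min.), the Python sketch below converts a scintillation count into pmol incorporated and then into units. It is a minimal illustration under stated assumptions: the function names are hypothetical, the count of 6,300 cpm is an invented example value and not a result from this disclosure, and the specific activity of 3,000 cpm/pmol is simply a point within the 2,000-4,000 cpm/pmol range given in the text.

```python
def pmol_incorporated(cpm, specific_activity_cpm_per_pmol, fraction_counted=1.0):
    """Convert counts per minute retained on the filter into pmol of nucleotide.

    fraction_counted is the share of the reaction that was spotted and counted
    (0.5 when only one half of the reaction is quantitated).
    """
    if specific_activity_cpm_per_pmol <= 0 or not 0 < fraction_counted <= 1:
        raise ValueError("specific activity and fraction must be positive")
    return cpm / specific_activity_cpm_per_pmol / fraction_counted


def activity_units(pmol, minutes):
    """Units as defined in the text: pmol nucleotide incorporated per minute."""
    return pmol / minutes


if __name__ == "__main__":
    example_cpm = 6300.0  # hypothetical count, for illustration only
    pmol = pmol_incorporated(example_cpm, 3000.0, fraction_counted=0.5)
    print(f"{pmol:.2f} pmol incorporated, {activity_units(pmol, 5):.2f} units")
```

Because the specific activity is quoted as a range, the same count maps onto a twofold range of incorporated nucleotide, so the specific activity of the label batch actually used should be entered for any real calculation.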

The results of the assay are shown in Fig. 26. Lane 1 is the result obtained using the
T.th. Pol III (HEP.P1), which was capable of extending the primer around the ssDNA
circle to form RFII. Lane 2 shows the result of using the non-Pol III (HEP.P2), which
was not capable of this extension and produced only incomplete DNA products (the
result shown included 0.8 µg E. coli SSB, which did not increase the chain length of
the product. In the absence of SSB, the same product was observed, although the
band contained more counts). The greater amount of total synthesis observed in lane 2
is due to the build up of immature products in a small region of the gel. The presence
of immature products in lane 1 is likely due to a contaminating polymerase in the
preparation that cannot convert the single primer to the full length RFII form.
Alternatively, the presence of incomplete products in lane 1 (Pol III-type enzyme) is
due to secondary structure in the DNA which causes the Pol III to pause. In this case
it may be presumed that performing the reaction at higher temperature could remove
the secondary structure barrier. Alternatively, SSB (single strand binding protein)
could be added to the assay (although T.th. SSB would be needed, since addition of E.
coli SSB was tried and did not alter the quality of the product profile). Generally, SSB
is needed to remove secondary structure elements from ssDNA at 37°C for complete
extension of primers by mesophilic Pol III-type enzymes.

The assay described above was performed at 60°C. The T.th. Pol III HEP.P1 gained
activity as the temperature was increased from 37°C to 60°C, as expected for an
enzyme from a thermophilic source. The E. coli Pol III lost activity at 60°C
compared to 37°C, as expected for an enzyme from a mesophilic source.



The following is a list of documents related to the above disclosure and particularly to
the experimental procedures and discussions. The documents should be considered as
incorporated by reference in their entirety.
Gly Thr Gly Val Ala Glu Ile Ala Ala Ser Leu Ala Arg Gly Lys Thr
                245                 250                 255

Ala Glu Ala Leu Gly Leu Ala Arg Arg Leu Tyr Gly Glu Gly Tyr Ala
            260                 265                 270

Pro Arg Ser Leu Val Ser Gly Leu Leu Glu Val Phe Arg Glu Gly Leu
        275                 280                 285

Tyr Ala Ala Phe Gly Leu Ala Gly Thr Pro Leu Pro Ala Pro Pro Gln
    290                 295                 300

Ala Leu Ile Ala Ala Met Thr Ala Leu Asp Glu Ala Met Glu Arg Leu
305                 310                 315                 320

Ala Arg Arg Ser Asp Ala Leu Ser Leu Glu Val Ala Leu Leu Glu Ala
                325                 330                 335

Gly Arg Ala Leu Ala Ala Glu Ala Leu Pro Gln Pro Thr Gly Ala Pro
            340                 345                 350

Ser Pro Glu Val Gly Pro Lys Pro Glu Ser Pro Pro Thr Pro Glu Pro
        355                 360                 365

Pro Arg Pro Glu Glu Ala Pro Asp Leu Arg Glu Arg Trp Arg Ala Phe
    370                 375                 380

Leu Glu Ala Leu Arg Pro Thr Leu Arg Ala Phe Val Arg Glu Ala Arg
385                 390                 395                 400

Pro Glu Val Arg Glu Gly Gln Leu Cys Leu Ala Phe Pro Glu Asp Lys
                405                 410                 415

Ala Phe His Tyr Arg Lys Ala Ser Glu Gln Lys Val Arg Leu Leu Pro
            420                 425                 430

Leu Ala Gln Ala His Phe Gly Val Glu Glu Val Val Leu Val Leu Glu
        435                 440                 445

Gly Glu Lys Lys Lys Ala
    450

INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Primer"

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
CGCAAGCTTC ACGCNTACCT NTTCTCCGGN AC
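
The primer of SEQ ID NO:6 contains the ambiguity symbol N at several positions. As a hedged illustration only, the Python sketch below counts the primer length and the number of distinct sequences such a degenerate primer represents, using the standard IUPAC nucleotide ambiguity codes; the function name and the lookup table name are hypothetical and are not part of the sequence listing.

```python
# Standard IUPAC nucleotide codes mapped to the bases each symbol can represent.
IUPAC_BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_degeneracy(primer):
    """Return (length, number of distinct sequences) for a degenerate primer.

    Spaces used to group the sequence in blocks of ten are ignored.
    """
    bases = primer.replace(" ", "").upper()
    variants = 1
    for symbol in bases:
        variants *= len(IUPAC_BASES[symbol])
    return len(bases), variants


if __name__ == "__main__":
    length, variants = primer_degeneracy("CGCAAGCTTC ACGCNTACCT NTTCTCCGGN AC")
    print(f"length {length} nt, {variants}-fold degenerate")
```

The length returned for the printed sequence agrees with the 32 base pairs stated in the sequence characteristics, and each N multiplies the number of represented sequences by four.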
WHAT IS CLAIMED IS:

1. A DNA Polymerase III-type enzyme found in a thermophilic bacterium which
exhibits the following characteristics:
    the ability to extend a primer over a long stretch of ssDNA at elevated
    temperature;
    the ability to be stimulated by a cognate sliding clamp of the type that is
    assembled on DNA by a 'clamp' loader (e.g. γ complex);
    contains associated clamp loading sub-units that show DNA stimulated
    ATPase activity at elevated temperature and/or ionic strength; and
    an accessory protein with DNA polymerase-associated 3'-5' exonuclease
    activity.

2. The DNA Polymerase III-type enzyme according to Claim 1 which is isolated
from a thermophilic bacterium selected from Thermus and Thermotoga species.

3. The DNA Polymerase III-type enzyme according to Claim 2, wherein the
thermophilic bacterium comprises a member of the Thermus species.

4. The DNA Polymerase III-type enzyme according to Claim 3, wherein the
thermophilic bacterium comprises Thermus thermophilus.

5. The DNA Polymerase III-type enzyme according to Claim 1, which comprises
at least one of the following:
    A. a γ subunit having an amino acid sequence selected from the formula
    set forth in SEQ ID NOS:4 and 5;
    B. a τ subunit having an amino acid sequence corresponding to the
    formula set forth in SEQ ID NO:2;
    C. an ε subunit having an amino acid sequence corresponding to the
    formula set forth in SEQ ID NO:95;
    D. an α subunit comprising an amino acid sequence corresponding to the
    formula set forth in SEQ ID NO:87;
    E. a β subunit having an amino acid sequence corresponding to the
    formula set forth in SEQ ID NO:107; and
variants, including allelic variants, muteins, analogs and fragments of any of
subparts (A) through (E), and combinations thereof, capable of functioning in DNA
amplification and sequencing.

6. The DNA Polymerase III-type enzyme of Claim 1, which includes a γ sub-unit
which exhibits a frameshift as great as -2.

7. An isolated polynucleotide encoding a τ subunit of a Thermus thermophilus
DNA polymerase III-type enzyme, wherein said τ subunit has a molecular weight of
about 58,000 daltons as determined by SDS-PAGE under non-reducing conditions.

8. An isolated polynucleotide according to Claim 7, wherein said amino acid
residue sequence is represented by the formula shown in SEQ ID NO:2, analogs
thereof, muteins thereof, alleles thereof, and active fragments thereof.

9. An isolated polynucleotide according to Claim 7, wherein the
polynucleotide sequence is the polynucleotide sequence of positions 132 to
1713 of SEQ ID NO:1, conserved variants thereof, analogs thereof, active fragments
thereof, and combinations thereof.

10. An isolated polynucleotide according to Claim 7, wherein the polynucleotide
is the polynucleotide sequence of positions 1 to 2027 of SEQ ID NO:1, conserved
variants thereof, analogs thereof, active fragments thereof, and combinations thereof.
11. An isolated nucleic acid molecule encoding at least a single subunit of a DNA
polymerase III-type enzyme found in a thermophilic bacterium, which nucleic acid
molecule is selected from the group consisting of:
    A. dnaX;
    B. dnaQ;
    C. dnaE;
    D. dnaN;
    E. variants, including conserved variants, analogs and fragments of any of
    subparts (A) through (D), and combinations thereof, capable of functioning in DNA
    amplification and sequencing.

12. The isolated nucleic acid molecule according to Claim 11, wherein said
nucleic acid molecule comprises dnaX.

13. The isolated nucleic acid molecule according to Claim 11, wherein said
nucleic acid molecule comprises dnaQ.

14. An isolated nucleic acid molecule associated with a DNA polymerase III-type
enzyme found in a thermophilic bacterium, wherein said nucleic acid molecule
comprises dnaA.

15. The isolated nucleic acid molecule according to Claim 11, wherein said
nucleic acid molecule comprises dnaN.

16. A subunit of a DNA polymerase III-type enzyme found in a thermophilic
bacterium, which subunit has an amino acid sequence selected from the group
consisting of SEQ ID NO:4; SEQ ID NO:5; SEQ ID NO:2; SEQ ID NO:95; SEQ ID
NO:87; SEQ ID NO:107; muteins thereof; alleles thereof; analogs thereof; active
fragments thereof; and combinations thereof.
17. The subunit of Claim 16, wherein said subunit has an amino acid sequence
selected from SEQ ID NO:4 and SEQ ID NO:5, and comprises the γ subunit of a
Thermus thermophilus DNA polymerase III-type enzyme.

18. The subunit of Claim 16, wherein said subunit has an amino acid sequence set
forth in SEQ ID NO:2, and comprises the τ subunit of a Thermus thermophilus DNA
polymerase III-type enzyme.

19. The subunit of Claim 16, wherein said subunit has an amino acid sequence set
forth in SEQ ID NO:95, and comprises the ε subunit of a Thermus thermophilus DNA
polymerase III-type enzyme.

20. The subunit of Claim 16, wherein said subunit has an amino acid sequence set
forth in SEQ ID NO:87, and comprises the α subunit of a Thermus thermophilus DNA
polymerase III-type enzyme.

21. The subunit of Claim 16, wherein said subunit has an amino acid sequence set
forth in SEQ ID NO:107, and comprises the β subunit of a Thermus thermophilus
DNA polymerase III-type enzyme.

22. A vector comprising an isolated nucleic acid molecule taken from Claim 11.

23. A vector comprising the isolated nucleic acid molecule of Claim 12.

24. A vector comprising the isolated nucleic acid molecule of Claim 13.

25. A vector comprising the isolated nucleic acid molecule of Claim 14.

26. A vector comprising the isolated nucleic acid molecule of Claim 15.
27. A vector selected from pETdnaX and pETdnaN.

28. A host cell comprising at least one of the vectors of Claim 22.

29. The host cell according to Claim 28, wherein the host cell is a prokaryotic cell.

30. A host cell comprising a vector according to Claim 23.

31. A host cell comprising a vector according to Claim 24.

32. A host cell comprising a vector according to Claim 25.

33. A host cell comprising a vector according to Claim 26.

34. The host cell according to Claim 30, wherein the host cell is a
prokaryotic cell.

35. The host cell according to Claim 31, wherein the host cell is a
prokaryotic cell.

36. The host cell according to Claim 32, wherein the host cell is a
prokaryotic cell.

37. The host cell according to Claim 33, wherein the host cell is a
prokaryotic cell.

38. An isolated DNA which codes for a recombinant DNA polymerase III-type
enzyme, or subunit thereof, from a thermophilic bacterium, consisting essentially of a
DNA fragment which hybridizes in a Southern blot to an isolated DNA fragment
selected from the group consisting of the DNA fragments defined in SEQ ID NO:6
and the DNA fragments defined in SEQ ID NO:8, wherein hybridization is conducted
under the following conditions:
    a) hybridization: 1% crystalline BSA (fraction V) (Sigma), 1 mM EDTA,
    0.5 M NaHPO4 (pH 7.2), 7% SDS at 65°C for 12 hours; and
    b) wash: 5 x 20 minutes with wash buffer consisting of 0.5% BSA
    (fraction V), 1 mM Na2EDTA, 40 mM NaHPO4 (pH 7.2), and 5% SDS.

39. A cloning vector comprising the isolated DNA of Claim 38.

40. A host cell transformed by the vector of Claim 39.

41. A method for producing a recombinant thermostable DNA polymerase III-type
enzyme, or subunit thereof, from a thermophilic bacterium comprising culturing a host
cell transformed with the vector of Claim 39 under conditions suitable for the
expression of said DNA polymerase III-type enzyme or said subunit.

42. A DNA probe which hybridizes to the DNA sequence coding for the
thermostable DNA polymerase III-type enzyme, or subunit thereof, of Claim 38,
wherein the DNA probe is selected from the group consisting of SEQ ID NO:6 and
SEQ ID NO:8.

43. A method for isolating a target DNA fragment consisting essentially of
a DNA coding for a thermostable DNA polymerase III-type enzyme, or subunit
thereof, from a thermophilic bacterium comprising the steps of:
    (a) forming a genomic library from the bacterium;
    (b) transforming or transfecting an appropriate host cell with the library of step
    (a);
    (c) contacting DNA from the transformed or transfected host cell with a DNA
    probe which hybridizes to a DNA fragment selected from the group consisting of the
    DNA fragments defined in SEQ ID NO:6 and the DNA fragments defined in SEQ ID
    NO:8, wherein hybridization is conducted under the following conditions:
        i) hybridization: 1% crystalline BSA (fraction V) (Sigma), 1 mM EDTA,
        0.5 M NaHPO4 (pH 7.2), 7% SDS at 65°C for 12 hours; and
        ii) wash: 5 x 20 minutes with wash buffer consisting of 0.5% BSA
        (fraction V), 1 mM Na2EDTA, 40 mM NaHPO4 (pH 7.2), and 5% SDS;
    (d) assaying the transformed or transfected cell of step (c) which hybridizes to
    the DNA probe for DNA polymerase III-type activity; and
    (e) isolating a target DNA fragment which codes for the thermostable DNA
    polymerase III-type enzyme or subunit thereof.

44. An isolated DNA molecule encoding a protein subunit of DNA polymerase
III-type enzyme from a thermophilic bacterium wherein the subunit group is selected
from the group consisting of τ, γ at a -1 frameshift, γ at a -2 frameshift, ε, α, and β.

45. The isolated DNA molecule according to Claim 44, wherein the subunit is τ
and has a molecular weight of 58 kD.

46. The isolated DNA molecule according to Claim 45, wherein the protein has an
amino acid sequence corresponding to SEQ ID NO:2.

47. The isolated DNA molecule according to Claim 45, wherein the DNA
molecule has a nucleotide sequence corresponding to SEQ ID NO:3.

48. The isolated DNA molecule according to Claim 44, wherein the subunit is γ at
a -1 frameshift, and has a molecular weight of 50.8 kD.

49. The isolated DNA molecule according to Claim 46, wherein the protein has an
amino acid sequence corresponding to SEQ ID NO:4.

50. The isolated DNA molecule according to Claim 44, wherein the subunit is γ at
a -2 frameshift, and has a molecular weight of 49.8 kD.

51. The isolated DNA molecule according to Claim 47, wherein the protein has an
amino acid sequence corresponding to SEQ ID NO:5.

52. An expression system comprising an isolated nucleic acid molecule according
to Claim 11.

53. The expression system according to Claim 52, wherein the protein corresponds
to τ and has an amino acid sequence corresponding to SEQ ID NO:2.

54. The expression system according to Claim 53, wherein the DNA molecule has
a nucleotide sequence corresponding to SEQ ID NO:3.

55. The expression system according to Claim 52, wherein the protein corresponds
to the ε subunit and has an amino acid sequence corresponding to SEQ ID NO:95.

56. The expression system according to Claim 55, wherein said subunit has a
nucleotide sequence corresponding to SEQ ID NO:94.

57. The expression system according to Claim 52, wherein the protein corresponds
to the α subunit and has an amino acid sequence corresponding to SEQ ID NO:87.

58. The expression system according to Claim 57, wherein said subunit has a
nucleotide sequence corresponding to SEQ ID NO:86.

59. The expression system according to Claim 52, wherein the protein corresponds
to the β subunit and has an amino acid sequence corresponding to SEQ ID NO:107.

60. The expression system according to Claim 59, wherein said subunit has a
nucleotide sequence corresponding to SEQ ID NO:106.

61. A host cell transformed with a heterologous nucleic acid molecule according
to Claim 11.

62. The host cell according to Claim 61, wherein the protein has an amino acid
sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID
NO:5, SEQ ID NO:87, SEQ ID NO:95, and SEQ ID NO:107.

63. The host cell according to Claim 61, wherein the nucleic acid molecule has a
nucleotide sequence selected from the group consisting of SEQ ID NO:1, SEQ ID
NO:3, SEQ ID NO:86, SEQ ID NO:94, and SEQ ID NO:106.

64. A DNA probe which hybridizes to the DNA sequence coding for the
thermostable DNA polymerase III-type enzyme of Claim 1, wherein the DNA probe is
selected from the group consisting of the oligonucleotide defined in SEQ ID NO:6;
the oligonucleotide defined in SEQ ID NO:8; the oligonucleotide defined in SEQ ID
NO:10; the oligonucleotide defined in SEQ ID NO:11; the oligonucleotide defined in
SEQ ID NO:12; the oligonucleotide defined in SEQ ID NO:13; the oligonucleotide
defined in SEQ ID NO:14; the oligonucleotide defined in SEQ ID NO:15; and the
oligonucleotide defined in SEQ ID NO:16.

65. A method for isolating a target DNA fragment consisting essentially of
a DNA coding for a thermostable DNA polymerase III-type enzyme, or subunit
thereof, from a thermophilic bacterium comprising the steps of:
    (a) forming a genomic library from the bacterium;
    (b) transforming or transfecting an appropriate host cell with the library of step
    (a);
    (c) contacting DNA from the transformed or transfected host cell with a DNA
    probe of Claim 61, wherein hybridization is conducted under the following
    conditions:
        i) hybridization: 1% crystalline BSA (fraction V) (Sigma), 1 mM EDTA,
        0.5 M NaHPO4 (pH 7.2), 7% SDS at 65°C for 12 hours; and
        ii) wash: 5 x 20 minutes with wash buffer consisting of 0.5% BSA
        (fraction V), 1 mM Na2EDTA, 40 mM NaHPO4 (pH 7.2), and 5% SDS;
    (d) assaying the transformed or transfected cell of step (c) which hybridizes to
    the DNA probe for DNA polymerase activity; and
    (e) isolating a target DNA fragment which codes for the thermostable DNA
    polymerase III-type enzyme or subunit thereof.

66. A method for amplifying a nucleic acid molecule, said method comprising
contacting said nucleic acid molecule with a composition comprising the DNA
polymerase III-type enzyme, or subunit thereof, of Claim 1.

67. A DNA molecule amplified by the method of Claim 66.

68. A method of preparing a recombinant vector comprising inserting a nucleic
acid molecule taken from Claim 11 into a vector.

69. The method of Claim 68, wherein said vector is an expression vector.

70. A recombinant vector made according to the method of Claim 68.

71. A method of making a recombinant host cell, comprising inserting the nucleic
acid molecule of Claim 11 into a host cell.

72. The method of Claim 71, wherein said host cell is a bacterial cell, a yeast cell
or an animal cell.

73. A kit for amplifying a nucleic acid molecule comprising a carrier and at least
two containers, wherein at least the DNA polymerase III-type enzyme of Claim 1 is
disposed in a first container, and the second container holds other reagent necessary or
useful for amplifying said nucleic acid molecule.
(i) APPLICANT: Yurieva, Olga
Kuriyan, John
O'Donnell, Michael E.
Jeruzalmi, David

(ii) TITLE OF INVENTION: ENZYME DERIVED FROM THERMOPHILIC ORGANISMS
THAT FUNCTIONS AS A CHROMOSOMAL REPLICASE, PREPARATION AND USES THEREOF

(iii) NUMBER OF SEQUENCES: 116

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: David A. Jackson, Esq.
(B) STREET: 411 Hackensack Ave, Continental Plaza, 4th Floor
(C) CITY: Hackensack
(D) STATE: New Jersey
(E) COUNTRY: USA
(F) ZIP: 07601

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: PCT/US
(B) FILING DATE:
(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 60/143,202
(B) FILING DATE: 08-APR-1997
(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/823,407
(B) FILING DATE: 08-APR-1997
(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Jackson Esq., David A.
(B) REGISTRATION NUMBER: 26,742
(C) REFERENCE/DOCKET NUMBER: 600-1-179 PCT

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 201-487-5800
(B) TELEFAX: 201-343-1684

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2007 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
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SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

(i) APPLICANT: Yurieva, Olga 
Kuriyan, John 
O'Donnell, Michael E. 
Jeruzalmi , David 

(ii) TITLE OF INVENTION: ENZYME DERIVED FROM THERMOPHILIC ORGANISMS 
THAT FUNCTIONS AS A CHROMOSOMAL REPLICASE, PREPARATION AND USES THEREOF 

(iii) NUMBER OF SEQUENCES: 116 

(iv) CORRESPONDENCE ADDRESS: 

(A} ADDRESSEE: David A. Jackson, Esq. 

{B) STREET: 411 Hackensack Ave, Continental Plaza, 4th 
Floor 

(C) CITY: Hackensack 

(D) STATE: New Jersey 

(E) COUNTRY: USA 

(F) ZIP : 07601 

{v} COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 
{B} COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS /MS-DOS 

(D) SOFTWARE : - Patent In Release #1.0, Version #1.30 

(vi) CURRENT APPLICATION DATA: 

{A} APPLICATION NUMBER: US 

(B) FILING DATE: 

(C ) CLASSIFICATION : 

(vii) PRIOR APPLICATION DATA: 

{A) APPLICATION NUMBER: US 60/143,202 

(B) FILING DATE: 08-APR-1997 

( C ) CLASSIFICATION : 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/823,407 

(B) FILING DATE: 08-APR-1997 

(C) CLASSIFICATION: 

(viii) ATTORNEY /AGENT INFORMATION: 

(A) NAME: Jackson Esq. , David A. 

(B) REGISTRATION NUMBER: 2 6,742 

(C) REFERENCE/DOCKET NUMBER: 600-1-179 N 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 201-487-5800 

(B) TELEFAX: 201-343-1684 



(2) INFORMATION FOR SEQ ID NO:l: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2007 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
TCCGGGGGTG GGGTTCCCAG GTAGACCCCG GCCCCTCCCG TGAGCCCCTT TACCCAGGCC 60 

GCCACCTCCT CCAGGGGGGC CAAGGCGTGC AAGGAGAGGA ACGTCCGCAC CACGCCCTAT 120 
ACTAGCCTTG TGAGCGCCCT CTACCGCCGC TTCCGCCCCC TCACCTTCCA GGAGGTGGTG 180 

GGGCAGGAGC ACGTGAAGGA GCCCCTCCTC AAGGCCATCC GGGAGGGGAG GCTCGCCCAG 240 
GCCTACCTCT TCTCCGGGCC CAGGGGCGTG GGCAAGACCA CCACGGCGAG GCTCCTCGCC 300 
ATGGCGGTGG GGTGCCAGGG GGAAGACCCC CCTTGCGGGG TCTGCCCCCA CTGCCAGGCG 360 
GTGCAGAGGG GCGCCCACCC GGACGTGGTG GACATTGACG CCGCCAGCAA CAACTCCGTG 420 
GAGGACGTGC GGGAGCTGAG GGAAAGGATC CACCTCGCCC CCCTCTCTGC CCCCAGGAAG 480 
GTCTTCATCC TGGACGAGGC CCACATGCTC TCCAAAAGCG CCTTCAACGC CCTCCTCAAG 540 
ACCCTGGAGG AGCCCCCGCC CCACGTCCTC TTCGTCTTCG CCACCACCGA GCCCGAGAGG 600
ATGCCCCCCA CCATCCTCTC CCGCACCCAG CACTTCCGCT TCCGCCGCCT CACGGAGGAG 660 
GAGATCGCCT TTAAGCTCCG GCGCATCCTG GAGGCCGTGG GGCGGGAGGC GGAGGAGGAG 720 
GCCCTCCTCC TCCTCGCCCG CCTGGCGGAC GGGGCCCTTA GGGACGCGGA AAGCCTCCTG 780

GAGCGCTTCC TCCTCCTGGA AGGCCCCCTC ACCCGGAAGG AGGTGGAGCG CGCCCTAGGC 840 
TCCCCCCCAG GGACCGGGGT GGCCGAGATC GCCGCCTCCC TCGCGAGGGG GAAAACGGCG 900
GAGGCCCTGG GCCTCGCCCG GCGCCTCTAC GGGGAAGGGT ACGCCCCGAG GAGCCTGGTC 960 

TCGGGCCTTT TGGAGGTGTT CCGGGAAGGC CTCTACGCCG CCTTCGGCCT CGCGGGAACC 1020 

CCCCTTCCCG CCCCGCCCCA GGCCCTGATC GCCGCCATGA CCGCCCTGGA CGAGGCCATG 1080 

GAGCGCCTCG CCCGCCGCTC CGACGCCTTA AGCCTGGAGG TGGCCCTCCT GGAGGCGGGA 1140 

AGGGCCCTGG CCGCCGAGGC CCTACCCCAG CCCACGGGCG CTCCTTCCCC AGAGGTCGGC 1200 

CCCAAGCCGG AAAGCCCCCC GACCCCGGAA CCCCCAAGGC CCGAGGAGGC GCCCGACCTG 1260

CGGGAGCGGT GGCGGGCCTT CCTCGAGGCC CTCAGGCCCA CCCTACGGGC CTTCGTGCGG 1320 

GAGGCCCGCC CGGAGGTCCG GGAAGGCCAG CTCTGCCTCG CTTTCCCCGA GGACAAGGCC 1380 

TTCCACTACC GCAAGGCCTC GGAACAGAAG GTGAGGCTCC TCCCCCTGGC CCAGGCCCAT 1440 

TTCGGGGTGG AGGAGGTCGT CCTCGTCCTG GAGGGAGAAA AAAAAAGCCT GAGCCCAAGG 1500 

CCCCGCCCGG CCCCACCTCC TGAAGCGCCC GCACCCCCGG GCCCTCCCGA GGAGGAGGTA 1560 

GAGGCGGAGG AAGCGGCGGA GGAGGCCCCG GAGGAGGCCT TGAGGCGGGT GGTCCGCCTC 1620

CTGGGGGGGC GGGTGCTCTG GGTGCGGCGG CCCAGGACCC GGGAGGCGCC GGAGGAGGAA 1680

CCCCTGAGCC AAGACGAGAT AGGGGGTACT GGTATATAAT GGGGGCATGA CGCGGACCAC 1740 

CGACCTCGGA CAAGAGACCG TGGACAACAT CCTCAAGCGC CTCCGCCGTA TTGAGGGCCA 1800 

GGTGCGGGGG CTCCAGAAGA TGGTGGCCGA GGGCCGCCCC TGCGACGAGG TCCTCACCCA 1860 

GATGACCGCC ACCAAGAAGG CCATGGAGGC GGCGGCCACC CTGATCCTCC ACGAGTTCCT 1920 

GAACGTCTGC GCCGCCGAGG TCTCCGAGGG CAAGGTGAAC CCCAAGAAGC CCGAGGAGAT 1980 

CGCCACCATG CTGAAGAACT TCATCTA 2007 
(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 529 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO

(v) FRAGMENT TYPE: internal



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Ser Ala Leu Tyr Arg Arg Phe Arg Pro Leu Thr Phe Gln Glu Val
1 5 10 15

Val Gly Gln Glu His Val Lys Glu Pro Leu Leu Lys Ala Ile Arg Glu
20 25 30

Gly Arg Leu Ala Gln Ala Tyr Leu Phe Ser Gly Pro Arg Gly Val Gly
35 40 45

Lys Thr Thr Thr Ala Arg Leu Leu Ala Met Ala Val Gly Cys Gln Gly
50 55 60

Glu Asp Pro Pro Cys Gly Val Cys Pro His Cys Gln Ala Val Gln Arg
65 70 75 80

Gly Ala His Pro Asp Val Val Asp Ile Asp Ala Ala Ser Asn Asn Ser
85 90 95

Val Glu Asp Val Arg Glu Leu Arg Glu Arg Ile His Leu Ala Pro Leu
100 105 110

Ser Ala Pro Arg Lys Val Phe Ile Leu Asp Glu Ala His Met Leu Ser
115 120 125

Lys Ser Ala Phe Asn Ala Leu Leu Lys Thr Leu Glu Glu Pro Pro Pro
130 135 140

His Val Leu Phe Val Phe Ala Thr Thr Glu Pro Glu Arg Met Pro Pro
145 150 155 160

Thr Ile Leu Ser Arg Thr Gln His Phe Arg Phe Arg Arg Leu Thr Glu
165 170 175

Glu Glu Ile Ala Phe Lys Leu Arg Arg Ile Leu Glu Ala Val Gly Arg
180 185 190 

Glu Ala Glu Glu Glu Ala Leu Leu Leu Leu Ala Arg Leu Ala Asp Gly 
195 200 205 

Ala Leu Arg Asp Ala Glu Ser Leu Leu Glu Arg Phe Leu Leu Leu Glu 
210 215 220 

Gly Pro Leu Thr Arg Lys Glu Val Glu Arg Ala Leu Gly Ser Pro Pro 
225 230 235 240 

Gly Thr Gly Val Ala Glu Ile Ala Ala Ser Leu Ala Arg Gly Lys Thr
245 250 255 

Ala Glu Ala Leu Gly Leu Ala Arg Arg Leu Tyr Gly Glu Gly Tyr Ala 
260 265 270 

Pro Arg Ser Leu Val Ser Gly Leu Leu Glu Val Phe Arg Glu Gly Leu 
275 280 285

Tyr Ala Ala Phe Gly Leu Ala Gly Thr Pro Leu Pro Ala Pro Pro Gln
290 295 300

Ala Leu Ile Ala Ala Met Thr Ala Leu Asp Glu Ala Met Glu Arg Leu
305 310 315 320 

Ala Arg Arg Ser Asp Ala Leu Ser Leu Glu Val Ala Leu Leu Glu Ala 
325 330 335 

Gly Arg Ala Leu Ala Ala Glu Ala Leu Pro Gln Pro Thr Gly Ala Pro
340 345 350 

Ser Pro Glu Val Gly Pro Lys Pro Glu Ser Pro Pro Thr Pro Glu Pro 
355 360 365 

Pro Arg Pro Glu Glu Ala Pro Asp Leu Arg Glu Arg Trp Arg Ala Phe 
370 375 380 

Leu Glu Ala Leu Arg Pro Thr Leu Arg Ala Phe Val Arg Glu Ala Arg 
385 390 395 400 

Pro Glu Val Arg Glu Gly Gln Leu Cys Leu Ala Phe Pro Glu Asp Lys
405 410 415

Ala Phe His Tyr Arg Lys Ala Ser Glu Gln Lys Val Arg Leu Leu Pro
420 425 430

Leu Ala Gln Ala His Phe Gly Val Glu Glu Val Val Leu Val Leu Glu
435 440 445 

Gly Glu Lys Lys Ser Leu Ser Pro Arg Pro Arg Pro Ala Pro Pro Pro 
450 455 460 

Glu Ala Pro Ala Pro Pro Gly Pro Pro Glu Glu Glu Val Glu Ala Glu 
465 470 475 480 

Glu Ala Ala Glu Glu Ala Pro Glu Glu Ala Leu Arg Arg Val Val Arg 
485 490 495

Leu Leu Gly Gly Arg Val Leu Trp Val Arg Arg Pro Arg Thr Arg Glu
500 505 510

Ala Pro Glu Glu Glu Pro Leu Ser Gln Asp Glu Ile Gly Gly Thr Gly
515 520 525

Ile

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1590 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

GTGAGCGCCC TCTACCGCCG CTTCCGCCCC CTCACCTTCC AGGAGGTGGT GGGGCAGGAG 60 

CACGTGAAGG AGCCCCTCCT CAAGGCCATC CGGGAGGGGA GGCTCGCCCA GGCCTACCTC 120 

TTCTCCGGGC CCAGGGGCGT GGGCAAGACC ACCACGGCGA GGCTCCTCGC CATGGCGGTG 180 

GGGTGCCAGG GGGAAGACCC CCCTTGCGGG GTCTGCCCCC ACTGCCAGGC GGTGCAGAGG 240 

GGCGCCCACC CGGACGTGGT GGACATTGAC GCCGCCAGCA ACAACTCCGT GGAGGACGTG 300

CGGGAGCTGA GGGAAAGGAT CCACCTCGCC CCCCTCTCTG CCCCCAGGAA GGTCTTCATC 360

CTGGACGAGG CCCACATGCT CTCCAAAAGC GCCTTCAACG CCCTCCTCAA GACCCTGGAG 420 

GAGCCCCCGC CCCACGTCCT CTTCGTCTTC GCCACCACCG AGCCCGAGAG GATGCCCCCC 480 

ACCATCCTCT CCCGCACCCA GCACTTCCGC TTCCGCCGCC TCACGGAGGA GGAGATCGCC 540 

TTTAAGCTCC GGCGCATCCT GGAGGCCGTG GGGCGGGAGG CGGAGGAGGA GGCCCTCCTC 600 

CTCCTCGCCC GCCTGGCGGA CGGGGCCCTT AGGGACGCGG AAAGCCTCCT GGAGCGCTTC 660 

CTCCTCCTGG AAGGCCCCCT CACCCGGAAG GAGGTGGAGC GCGCCCTAGG CTCCCCCCCA 720

GGGACCGGGG TGGCCGAGAT CGCCGCCTCC CTCGCGAGGG GGAAAACGGC GGAGGCCCTG 780 

GGCCTCGCCC GGCGCCTCTA CGGGGAAGGG TACGCCCCGA GGAGCCTGGT CTCGGGCCTT 840 

TTGGAGGTGT TCCGGGAAGG CCTCTACGCC GCCTTCGGCC TCGCGGGAAC CCCCCTTCCC 900 

GCCCCGCCCC AGGCCCTGAT CGCCGCCATG ACCGCCCTGG ACGAGGCCAT GGAGCGCCTC 960 

GCCCGCCGCT CCGACGCCTT AAGCCTGGAG GTGGCCCTCC TGGAGGCGGG AAGGGCCCTG 1020 

GCCGCCGAGG CCCTACCCCA GCCCACGGGC GCTCCTTCCC CAGAGGTCGG CCCCAAGCCG 1080

GAAAGCCCCC CGACCCCGGA ACCCCCAAGG CCCGAGGAGG CGCCCGACCT GCGGGAGCGG 1140

TGGCGGGCCT TCCTCGAGGC CCTCAGGCCC ACCCTACGGG CCTTCGTGCG GGAGGCCCGC 1200 

CCGGAGGTCC GGGAAGGCCA GCTCTGCCTC GCTTTCCCCG AGGACAAGGC CTTCCACTAC 1260 

CGCAAGGCCT CGGAACAGAA GGTGAGGCTC CTCCCCCTGG CCCAGGCCCA TTTCGGGGTG 1320 

GAGGAGGTCG TCCTCGTCCT GGAGGGAGAA AAAAAAAGCC TGAGCCCAAG GCCCCGCCCG 1380 

GCCCCACCTC CTGAAGCGCC CGCACCCCCG GGCCCTCCCG AGGAGGAGGT AGAGGCGGAG 1440 

GAAGCGGCGG AGGAGGCCCC GGAGGAGGCC TTGAGGCGGG TGGTCCGCCT CCTGGGGGGG 1500 

CGGGTGCTCT GGGTGCGGCG GCCCAGGACC CGGGAGGCGC CGGAGGAGGA ACCCCTGAGC 1560 

CAAGACGAGA TAGGGGGTAC TGGTATATAA 1590 
(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 464 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Ser Ala Leu Tyr Arg Arg Phe Arg Pro Leu Thr Phe Gln Glu Val
1 5 10 15

Val Gly Gln Glu His Val Lys Glu Pro Leu Leu Lys Ala Ile Arg Glu
20 25 30

Gly Arg Leu Ala Gln Ala Tyr Leu Phe Ser Gly Pro Arg Gly Val Gly
35 40 45

Lys Thr Thr Thr Ala Arg Leu Leu Ala Met Ala Val Gly Cys Gln Gly
50 55 60

Glu Asp Pro Pro Cys Gly Val Cys Pro His Cys Gln Ala Val Gln Arg
65 70 75 80

Gly Ala His Pro Asp Val Val Asp Ile Asp Ala Ala Ser Asn Asn Ser
85 90 95

Val Glu Asp Val Arg Glu Leu Arg Glu Arg Ile His Leu Ala Pro Leu
100 105 110

Ser Ala Pro Arg Lys Val Phe Ile Leu Asp Glu Ala His Met Leu Ser
115 120 125

Lys Ser Ala Phe Asn Ala Leu Leu Lys Thr Leu Glu Glu Pro Pro Pro
130 135 140



His Val Leu Phe Val Phe Ala Thr Thr Glu Pro Glu Arg Met Pro Pro 
145 150 155 160 

Thr Ile Leu Ser Arg Thr Gln His Phe Arg Phe Arg Arg Leu Thr Glu
165 170 175

Glu Glu Ile Ala Phe Lys Leu Arg Arg Ile Leu Glu Ala Val Gly Arg
180 185 190 

Glu Ala Glu Glu Glu Ala Leu Leu Leu Leu Ala Arg Leu Ala Asp Gly 
195 200 205 

Ala Leu Arg Asp Ala Glu Ser Leu Leu Glu Arg Phe Leu Leu Leu Glu 
210 215 220 

Gly Pro Leu Thr Arg Lys Glu Val Glu Arg Ala Leu Gly Ser Pro Pro 
225 230 235 240 

Gly Thr Gly Val Ala Glu Ile Ala Ala Ser Leu Ala Arg Gly Lys Thr
245 250 255 

Ala Glu Ala Leu Gly Leu Ala Arg Arg Leu Tyr Gly Glu Gly Tyr Ala 
260 265 270 

Pro Arg Ser Leu Val Ser Gly Leu Leu Glu Val Phe Arg Glu Gly Leu 
275 280 285 

Tyr Ala Ala Phe Gly Leu Ala Gly Thr Pro Leu Pro Ala Pro Pro Gln
290 295 300

Ala Leu Ile Ala Ala Met Thr Ala Leu Asp Glu Ala Met Glu Arg Leu
305 310 315 320

Ala Arg Arg Ser Asp Ala Leu Ser Leu Glu Val Ala Leu Leu Glu Ala 
325 330 335 

Gly Arg Ala Leu Ala Ala Glu Ala Leu Pro Gln Pro Thr Gly Ala Pro
340 345 350 

Ser Pro Glu Val Gly Pro Lys Pro Glu Ser Pro Pro Thr Pro Glu Pro 
355 360 365 

Pro Arg Pro Glu Glu Ala Pro Asp Leu Arg Glu Arg Trp Arg Ala Phe 
370 375 380 

Leu Glu Ala Leu Arg Pro Thr Leu Arg Ala Phe Val Arg Glu Ala Arg 
385 390 395 400 

Pro Glu Val Arg Glu Gly Gln Leu Cys Leu Ala Phe Pro Glu Asp Lys
405 410 415

Ala Phe His Tyr Arg Lys Ala Ser Glu Gln Lys Val Arg Leu Leu Pro
420 425 430

Leu Ala Gln Ala His Phe Gly Val Glu Glu Val Val Leu Val Leu Glu
435 440 445 

Gly Glu Lys Lys Lys Pro Glu Pro Lys Ala Pro Pro Gly Pro Thr Ser 
450 455 460

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 454 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO 

(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

Met Ser Ala Leu Tyr Arg Arg Phe Arg Pro Leu Thr Phe Gln Glu Val
1 5 10 15

Val Gly Gln Glu His Val Lys Glu Pro Leu Leu Lys Ala Ile Arg Glu
20 25 30

Gly Arg Leu Ala Gln Ala Tyr Leu Phe Ser Gly Pro Arg Gly Val Gly
35 40 45

Lys Thr Thr Thr Ala Arg Leu Leu Ala Met Ala Val Gly Cys Gln Gly
50 55 60

Glu Asp Pro Pro Cys Gly Val Cys Pro His Cys Gln Ala Val Gln Arg
65 70 75 80

Gly Ala His Pro Asp Val Val Asp Ile Asp Ala Ala Ser Asn Asn Ser
85 90 95

Val Glu Asp Val Arg Glu Leu Arg Glu Arg Ile His Leu Ala Pro Leu
100 105 110

Ser Ala Pro Arg Lys Val Phe Ile Leu Asp Glu Ala His Met Leu Ser
115 120 125 

Lys Ser Ala Phe Asn Ala Leu Leu Lys Thr Leu Glu Glu Pro Pro Pro 
130 135 140 

His Val Leu Phe Val Phe Ala Thr Thr Glu Pro Glu Arg Met Pro Pro 
145 150 155 160 

Thr Ile Leu Ser Arg Thr Gln His Phe Arg Phe Arg Arg Leu Thr Glu
165 170 175

Glu Glu Ile Ala Phe Lys Leu Arg Arg Ile Leu Glu Ala Val Gly Arg
180 185 190 

Glu Ala Glu Glu Glu Ala Leu Leu Leu Leu Ala Arg Leu Ala Asp Gly 
195 200 205 

Ala Leu Arg Asp Ala Glu Ser Leu Leu Glu Arg Phe Leu Leu Leu Glu 
210 215 220 

Gly Pro Leu Thr Arg Lys Glu Val Glu Arg Ala Leu Gly Ser Pro Pro 
225 230 235 240

Gly Thr Gly Val Ala Glu Ile Ala Ala Ser Leu Ala Arg Gly Lys Thr
245 250 255

Ala Glu Ala Leu Gly Leu Ala Arg Arg Leu Tyr Gly Glu Gly Tyr Ala
260 265 270

Pro Arg Ser Leu Val Ser Gly Leu Leu Glu Val Phe Arg Glu Gly Leu
275 280 285

Tyr Ala Ala Phe Gly Leu Ala Gly Thr Pro Leu Pro Ala Pro Pro Gln
290 295 300

Ala Leu Ile Ala Ala Met Thr Ala Leu Asp Glu Ala Met Glu Arg Leu
305 310 315 320

Ala Arg Arg Ser Asp Ala Leu Ser Leu Glu Val Ala Leu Leu Glu Ala
325 330 335

Gly Arg Ala Leu Ala Ala Glu Ala Leu Pro Gln Pro Thr Gly Ala Pro
340 345 350



Ser Pro Glu Val Gly Pro Lys Pro Glu Ser Pro Pro Thr Pro Glu Pro 
355 360 365

Pro Arg Pro Glu Glu Ala Pro Asp Leu Arg Glu Arg Trp Arg Ala Phe 
370 375 380 

Leu Glu Ala Leu Arg Pro Thr Leu Arg Ala Phe Val Arg Glu Ala Arg 
385 390 395 400

Pro Glu Val Arg Glu Gly Gln Leu Cys Leu Ala Phe Pro Glu Asp Lys
405 410 415

Ala Phe His Tyr Arg Lys Ala Ser Glu Gln Lys Val Arg Leu Leu Pro
420 425 430

Leu Ala Gln Ala His Phe Gly Val Glu Glu Val Val Leu Val Leu Glu
435 440 445 

Gly Glu Lys Lys Lys Ala 
450 

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "Primer" 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
CGCAAGCTTC ACGCNTACCT NTTCTCCGGN AC 32

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 8 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 

(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7 

His Ala Tyr Leu Phe Ser Gly Thr 
1 5 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "Primer" 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
CGCGAATTCG TGCTCNGGNG GCTCCTCNAG NGTC
(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 

(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

Lys Thr Leu Glu Glu Pro Pro Glu His
1 5

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "Primer"

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10 

GCGCGGATCC GGAGGGAGAA AAAAAAAGCC TCAGCCCA 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 38 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Primer" 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11 
GCGCGGATCC GGAGGGAGAG AAGAAAAGCC TCAGCCCA 
(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "Primer" 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12 
GAATTAAATT CGCGCTTCGG GAGGTGGG
(2) INFORMATION FOR SEQ ID NO:13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "Primer" 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:

GCGCGAATTC GCGCTTCGGG AGGTGGG 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 29 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "Primer" 

(iii) HYPOTHETICAL: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
GCGCGAATTC GGGCGCTTCA GGAGGTGGG 
(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Primer"

(iii) HYPOTHETICAL: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15: 
GTGGTGCATA TGGTGAGCGC CCTCTACCGC C 
(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Primer" 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
GTGGTGGTCG ACCCAGGAGG GCCACCTCCA G 
(2) INFORMATION FOR SEQ ID NO:17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 

(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Gly Xaa Xaa Gly Xaa Gly Lys Thr 
1 5 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Lys Pro Asp Pro Lys Ala Pro Pro Gly Pro Thr Ser 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 180 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO

(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

Met Ser Tyr Gln Val Leu Ala Arg Lys Trp Arg Pro Gln Thr Phe Ala
1 5 10 15

Asp Val Val Gly Gln Glu His Val Leu Thr Ala Leu Ala Asn Gly Leu
20 25 30

Ser Leu Gly Arg Ile His His Ala Tyr Leu Phe Ser Gly Thr Arg Gly
35 40 45

Val Gly Lys Thr Ser Ile Ala Arg Leu Leu Ala Lys Gly Leu Asn Cys
50 55 60

Glu Thr Gly Ile Thr Ala Thr Pro Cys Gly Val Cys Asp Asn Cys Arg
65 70 75 80

Glu Ile Glu Gln Gly Arg Phe Val Asp Leu Ile Glu Ile Asp Ala Ala
85 90 95

Ser Arg Thr Lys Val Glu Asp Thr Arg Asp Leu Leu Asp Asn Val Gln
100 105 110

Tyr Ala Pro Ala Arg Gly Arg Phe Lys Val Tyr Leu Ile Asp Glu Val
115 120 125

His Met Leu Ser Arg His Ser Phe Asn Ala Leu Leu Lys Thr Leu Glu
130 135 140

Glu Pro Pro Glu His Val Lys Phe Leu Leu Ala Thr Thr Asp Pro Gln
145 150 155 160

Lys Leu Pro Val Thr Ile Leu Ser Arg Cys Leu Gln Phe His Leu Lys
165 170 175 

Ala Leu Asp Val 
180 

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 180 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO

(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

Met Ser Tyr Gln Ala Leu Tyr Arg Val Phe Arg Pro Gln Arg Phe Glu
1 5 10 15

Asp Val Val Gly Gln Glu His Ile Thr Lys Thr Leu Gln Asn Ala Leu
20 25 30

Leu Gln Lys Lys Phe Ser His Ala Tyr Leu Phe Ser Gly Pro Arg Gly
35 40 45

Thr Gly Lys Thr Ser Ala Ala Lys Ile Phe Ala Lys Ala Val Asn Cys
50 55 60

Glu His Ala Pro Val Asp Glu Pro Cys Asn Glu Cys Ala Ala Cys Lys
65 70 75 80

Gly Ile Thr Asn Gly Ser Ile Ser Asp Val Ile Glu Ile Asp Ala Ala
85 90 95

Ser Asn Asn Gly Val Asp Glu Ile Arg Asp Ile Arg Asp Lys Val Lys
100 105 110

Phe Ala Pro Ser Ala Val Thr Tyr Lys Val Tyr Ile Ile Asp Glu Val
115 120 125

His Met Leu Ser Ile Gly Ala Phe Asn Ala Leu Leu Lys Thr Leu Glu
130 135 140

Glu Pro Pro Glu His Cys Ile Phe Ile Leu Ala Thr Thr Glu Pro His
145 150 155 160

Lys Ile Pro Leu Thr Ile Ile Ser Arg Cys Gln Arg Phe Asp Phe Lys
165 170 175

Arg Ile Thr Ser
180 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 294 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

Met Ser Tyr Gln Val Leu Ala Arg Lys Trp Arg Pro Gln Thr Phe Ala
1 5 10 15

Asp Val Val Gly Gln Glu His Val Leu Thr Ala Leu Ala Asn Gly Leu
20 25 30

Ser Leu Gly Arg Ile His His Ala Tyr Leu Phe Ser Gly Thr Arg Gly
35 40 45

Val Gly Lys Thr Ser Ile Ala Arg Leu Leu Ala Lys Gly Leu Asn Cys
50 55 60

Glu Thr Gly Ile Thr Ala Thr Pro Cys Gly Val Cys Asp Asn Cys Arg
65 70 75 80

Glu Ile Glu Gln Gly Arg Phe Val Asp Leu Ile Glu Ile Asp Ala Ala
85 90 95

Ser Arg Thr Lys Val Glu Asp Thr Arg Asp Leu Leu Asp Asn Val Gln
100 105 110

Tyr Ala Pro Ala Arg Gly Arg Phe Lys Val Tyr Leu Ile Asp Glu Val
115 120 125

His Met Leu Ser Arg His Ser Phe Asn Ala Leu Leu Lys Thr Leu Glu
130 135 140

Glu Pro Pro Glu His Val Lys Phe Leu Leu Ala Thr Thr Asp Pro Gln
145 150 155 160

Lys Leu Pro Val Thr Ile Leu Ser Arg Cys Leu Gln Phe His Leu Lys
165 170 175

Ala Leu Asp Val Glu Gln Ile Arg His Gln Leu Glu His Ile Leu Asn
180 185 190

Glu Glu His Ile Ala His Glu Pro Arg Ala Leu Gln Leu Leu Ala Arg
195 200 205

Ala Ala Glu Gly Ser Leu Arg Asp Ala Leu Ser Leu Thr Asp Gln Ala
210 215 220

Ile Ala Ser Gly Asp Gly Gln Val Ser Thr Gln Ala Val Ser Ala Met
225 230 235 240

Leu Gly Thr Leu Asp Asp Asp Gln Ala Leu Ser Leu Val Glu Ala Met
245 250 255

Val Glu Ala Asn Gly Glu Arg Val Met Ala Leu Ile Asn Glu Ala Ala
260 265 270

Ala Arg Gly Ile Glu Trp Glu Ala Leu Leu Val Glu Met Leu Gly Leu
275 280 285

Leu His Arg Ile Ala Met
290



(2) INFORMATION FOR SEQ ID NO: 22: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 294 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO

(v) FRAGMENT TYPE: internal



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

Met Ser Tyr Gln Val Leu Ala Arg Lys Trp Arg Pro Lys Thr Phe Ala
1 5 10 15

Asp Val Val Gly Gln Glu His Ile Ile Thr Ala Leu Ala Asn Gly Leu
20 25 30

Lys Asp Asn Arg Leu His His Ala Tyr Leu Phe Ser Gly Thr Arg Gly
35 40 45

Val Gly Lys Thr Ser Ile Ala Arg Leu Phe Ala Lys Gly Leu Asn Cys
50 55 60

Val His Gly Val Thr Ala Thr Pro Cys Gly Glu Cys Glu Asn Cys Lys
65 70 75 80

Ala Ile Glu Gln Gly Asn Phe Ile Asp Leu Ile Glu Ile Asp Ala Ala
85 90 95

Ser Arg Thr Lys Val Glu Asp Thr Arg Glu Leu Leu Asp Asn Val Gln
100 105 110

Tyr Lys Pro Val Val Gly Arg Phe Lys Val Tyr Leu Ile Asp Glu Val
115 120 125

His Met Leu Ser Arg His Ser Phe Asn Ala Leu Leu Lys Thr Leu Glu
130 135 140

Glu Pro Pro Glu Tyr Val Lys Phe Leu Leu Ala Thr Thr Asp Pro Gln
145 150 155 160

Lys Leu Pro Val Thr Ile Leu Ser Arg Cys Leu Gln Phe His Leu Lys
165 170 175

Ala Leu Asp Glu Thr Gln Ile Ser Gln His Leu Ala His Ile Leu Thr
180 185 190

Gln Glu Asn Ile Pro Phe Glu Asp Pro Ala Leu Val Lys Leu Ala Lys
195 200 205

Ala Ala Gln Gly Ser Ile Arg Asp Ser Leu Ser Leu Thr Asp Gln Ala
210 215 220

Ile Ala Met Gly Asp Arg Gln Val Thr Asn Asn Val Val Ser Asn Met
225 230 235 240

Leu Gly Leu Leu Asp Asp Asn Tyr Ser Val Asp Ile Leu Tyr Ala Leu
245 250 255

His Gln Gly Asn Gly Glu Leu Leu Met Arg Thr Leu Gln Arg Val Ala
260 265 270

Asp Ala Ala Gly Asp Trp Asp Lys Leu Leu Gly Glu Cys Ala Glu Lys
275 280 285

Leu His Gln Ile Ala Leu
290 

(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 294 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

Met Ser Tyr Gln Ala Leu Tyr Arg Val Phe Arg Pro Gln Arg Phe Glu
1 5 10 15

Asp Val Val Gly Gln Glu His Ile Thr Lys Thr Leu Gln Asn Ala Leu
20 25 30

Leu Gln Lys Lys Phe Ser His Ala Tyr Leu Phe Ser Gly Pro Arg Gly
35 40 45

Thr Gly Lys Thr Ser Ala Ala Lys Ile Phe Ala Lys Ala Val Asn Cys
50 55 60

Glu His Ala Pro Val Asp Glu Pro Cys Asn Glu Cys Ala Ala Cys Lys
65 70 75 80

Gly Ile Thr Asn Gly Ser Ile Ser Asp Val Ile Glu Ile Asp Ala Ala
85 90 95

Ser Asn Asn Gly Val Asp Glu Ile Arg Asp Ile Arg Asp Lys Val Lys
100 105 110

Phe Ala Pro Ser Ala Val Thr Tyr Lys Val Tyr Ile Ile Asp Glu Val
115 120 125

His Met Leu Ser Ile Gly Ala Phe Asn Ala Leu Leu Lys Thr Leu Glu
130 135 140

Glu Pro Pro Glu His Cys Ile Phe Ile Leu Ala Thr Thr Glu Pro His
145 150 155 160

Lys Ile Pro Leu Thr Ile Ile Ser Arg Cys Gln Arg Phe Asp Phe Lys
165 170 175

Arg Ile Thr Ser Gln Ala Ile Val Gly Arg Met Asn Lys Ile Val Asp
180 185 190

Ala Glu Gln Leu Gln Val Glu Glu Gly Ser Leu Glu Ile Ile Ala Ser
195 200 205

Ala Ala His Gly Gly Met Arg Asp Ala Leu Ser Leu Leu Asp Gln Ala
210 215 220

Ile Ser Phe Ser Gly Asp Ile Leu Lys Val Glu Asp Ala Leu Leu Ile
225 230 235 240

Thr Gly Ala Val Ser Gln Leu Tyr Ile Gly Lys Leu Ala Lys Ser Leu
245 250 255

His Asp Lys Asn Val Ser Asp Ala Leu Glu Thr Leu Asn Glu Leu Leu
260 265 270

Gln Gln Gly Lys Asp Pro Ala Lys Leu Ile Glu Asp Met Ile Phe Tyr
275 280 285 

Phe Arg Asp Met Leu Leu 
290 

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 300 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO 

(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

Asp Ala Tyr Thr Val Leu Ala Arg Lys Tyr Arg Pro Arg Thr Phe Glu
1 5 10 15

Asp Leu Ile Gly Gln Glu Ala Met Val Arg Thr Leu Ala Asn Ala Phe
20 25 30

Ser Thr Gly Arg Ile Ala His Ala Phe Met Leu Thr Gly Val Arg Gly
35 40 45

Val Gly Lys Thr Thr Thr Ala Arg Leu Leu Ala Arg Ala Leu Asn Tyr
50 55 60

Glu Thr Asp Thr Val Lys Gly Pro Ser Val Asp Leu Thr Thr Glu Gly
65 70 75 80

Tyr His Cys Arg Ser Ile Ile Glu Gly Arg His Met Asp Val Leu Glu
85 90 95

Leu Asp Ala Ala Ser Arg Thr Lys Val Asp Glu Met Arg Glu Leu Leu
100 105 110

Asp Gly Val Arg Tyr Ala Pro Val Glu Ala Arg Tyr Lys Val Tyr Ile
115 120 125

Ile Asp Glu Val His Met Leu Ser Thr Ala Ala Phe Asn Ala Leu Leu
130 135 140

Lys Thr Leu Glu Glu Pro Pro Pro His Ala Lys Phe Ile Phe Ala Thr
145 150 155 160

Thr Glu Ile Arg Lys Val Pro Val Thr Ile Leu Ser Arg Cys Gln Arg
165 170 175

Phe Asp Leu Arg Arg Val Glu Pro Asp Val Leu Val Lys His Phe Asp
180 185 190

Arg Ile Ser Ala Lys Glu Gly Ala Arg Ile Glu Met Asp Ala Leu Ala
195 200 205

Leu Ile Ala Arg Ala Ala Glu Gly Ser Val Arg Asp Gly Leu Ser Leu
210 215 220

Leu Asp Gln Ala Ile Val Gln Thr Glu Arg Gly Gln Thr Val Thr Ser
225 230 235 240

Thr Val Val Arg Asp Met Leu Gly Leu Ala Asp Arg Ser Gln Thr Ile
245 250 255

Ala Leu Tyr Glu His Val Met Ala Gly Lys Thr Lys Asp Ala Leu Glu
260 265 270

Gly Phe Arg Ala Leu Trp Gly Phe Gly Ala Asp Pro Ala Val Val Met
275 280 285

Leu Asp Val Leu Asp His Cys His Ala Ser Ala Val
290 295 300

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 260 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25: 

Met His Gln Val Phe Tyr Gln Lys Tyr Arg Pro Ile Asn Phe Lys Gln
1 5 10 15

Thr Leu Gly Gln Glu Ser Ile Arg Lys Ile Leu Val Asn Ala Ile Asn
20 25 30

Arg Asp Lys Leu Pro Asn Gly Tyr Ile Phe Ser Gly Glu Arg Gly Thr
35 40 45

Gly Lys Thr Thr Phe Ala Lys Ile Ile Ala Lys Ala Ile Asn Cys Leu
50 55 60

Asn Trp Asp Gln Ile Asp Val Cys Asn Ser Cys Asp Val Cys Lys Ser
65 70 75 80

Ile Asn Thr Asn Ser Ala Ile Asp Ile Val Glu Ile Asp Ala Ala Ser
85 90 95

Lys Asn Gly Ile Asn Asp Ile Arg Glu Leu Val Glu Asn Val Phe Asn
100 105 110

His Pro Phe Thr Phe Lys Lys Lys Val Tyr Ile Leu Asp Glu Ala His
115 120 125

Met Leu Thr Thr Gln Ser Trp Gly Gly Leu Leu Lys Thr Leu Glu Glu
130 135 140

Ser Pro Pro Tyr Val Leu Phe Ile Phe Thr Thr Thr Glu Phe Asn Lys
145 150 155 160

Ile Pro Leu Thr Ile Leu Ser Arg Cys Gln Ser Phe Phe Phe Lys Lys
165 170 175

Ile Thr Ser Asp Leu Ile Leu Glu Arg Leu Asn Asp Ile Ala Lys Lys
180 185 190

Glu Lys Ile Lys Ile Glu Lys Asp Ala Leu Ile Lys Ile Ala Asp Leu
195 200 205

Ser Gln Gly Ser Leu Arg Asp Gly Leu Ser Leu Leu Asp Gln Leu Ala
210 215 220

Ile Ser Leu Ile Val Lys Lys Leu Val Leu Leu Met Leu Lys Lys His
225 230 235 240

Leu Ile Ser Leu Ile Glu Met Gln Asn Leu Leu Leu Leu Lys Gln Phe
245 250 255

Tyr Gln Glu Ile
260

(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 289 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 

(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

Val Ser Ala Leu Tyr Arg Arg Phe Arg Pro Leu Thr Phe Gln Glu Val
1 5 10 15

Val Gly Gln Glu His Val Lys Glu Pro Leu Leu Lys Ala Ile Arg Glu
20 25 30

Gly Arg Leu Ala Gln Ala Tyr Leu Phe Ser Gly Pro Arg Gly Val Gly
35 40 45

Lys Thr Thr Thr Ala Arg Leu Leu Ala Met Ala Val Gly Cys Gln Gly
50 55 60

Glu Asp Pro Pro Cys Gly Val Cys Pro His Cys Gln Ala Val Gln Arg
65 70 75 80

Gly Ala His Pro Asp Val Val Asp Ile Asp Ala Ala Ser Asn Asn Ser
85 90 95

Val Glu Asp Val Arg Glu Leu Arg Glu Arg Ile His Leu Ala Pro Leu
100 105 110

Ser Ala Pro Arg Lys Val Phe Ile Leu Asp Glu Ala His Met Leu Ser
115 120 125

Lys Ser Ala Phe Asn Ala Leu Leu Lys Thr Leu Glu Glu Pro Pro Pro
130 135 140

His Val Leu Phe Val Phe Ala Thr Thr Glu Pro Glu Arg Met Pro Pro
145 150 155 160

Thr Ile Leu Ser Arg Thr Gln His Phe Arg Phe Arg Arg Leu Thr Glu
165 170 175

Glu Glu Ile Ala Phe Lys Leu Arg Arg Ile Leu Glu Ala Val Gly Arg
180 185 190

Glu Ala Glu Glu Glu Ala Leu Leu Leu Leu Ala Arg Leu Ala Asp Gly
195 200 205

Ala Leu Arg Asp Ala Glu Ser Leu Leu Glu Arg Phe Leu Leu Leu Glu
210 215 220

Gly Pro Leu Thr Arg Lys Glu Val Glu Arg Ala Leu Gly Ser Pro Pro
225 230 235 240

Gly Thr Gly Val Ala Glu Ile Ala Ala Ser Leu Ala Arg Gly Lys Thr
245 250 255 

Ala Glu Ala Leu Gly Leu Ala Arg Arg Leu Tyr Gly Glu Gly Tyr Ala 
260 265 270 

Pro Arg Ser Leu Val Ser Gly Leu Leu Glu Val Phe Arg Glu Gly Leu 
275 280 285 

Tyr 



(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 101 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: RNA (genomic)
(iii) HYPOTHETICAL: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27: 
GUCCUGGAGG GAGAAAAAAA AAGCCUGAGC CCAAGGCCCC GCCCGGCCCC ACCUCCUGAA 
GCGCCCGCAC CCCCGGGCCC UCCCGAGGAG GAGGUAGAGG C 
(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

Val Leu Glu Gly Glu Lys Lys Ser Leu Ser Pro
1 5 10

(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "Primer" 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 
CACGCNTACC TNTTCTCCGG NAC 
(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "Primer" 



(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30
GTGCTCNGGN GGCTCCTCNT CNGTC 



(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "PRIMER"

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31 
GTGGGATCCG TGGTTCTGGA TCTCGATGAA GAA 
(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = " PRIMER" 

(iii) HYPOTHETICAL: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32 
GTGGGATCCA CGGSCTSTCS GAGCAGAAG 
(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "PRIMER"

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33
GCGGGATCCT CAACGAGGAC CTCTCCATCT TCAA 
(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 35 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "PRIMER" 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34

GCGGGATCCT TGTCGTCSAG SGTSAGSGCG TCGTA 

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 39 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = " PRIMER " 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35 
GGGAAGGACC AGCGCGTACT CCCCCTGCTC CTAGGTGTG 
(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "PRIMER" 

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36
GTGTGGATCC TTCTTCTTSC CCATSGC 
(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "PRIMER"

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37 
CACCGATTCC AGTGGTGCCT AGGTGTG 
(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "PRIMER" 

(iii) HYPOTHETICAL : NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38
CAACACCTGG TGTTCCAGGA GCCTGTGCTT 
(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "PRIMER" 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39 
CCAGAATCGT CTGCTGGTCG TAG

(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "PRIMER" 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40 
AGCACCCTGG AGGAGCTTC 

(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "PRIMER" 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41 
CATGTCGTAC TGGGTGTAC 

(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "PRIMER" 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42 
GTSGTSNNSG ACNNSGAGAC SACSGGG 
(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = " PRIMER" 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 
GAASCCSNNG TCGAASNNGG CGTTGTG
(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "PRIMER"

(iii) HYPOTHETICAL: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44 
CGGGGATCCA CCTCAATCAC CTCGTGG 
(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "PRIMER" 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45 
CGGGGATCCG CCACCTTGCG GCTCCGGGTG 
(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = " PRIMER " 

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46
GCGCTCTAGA CGAGTTCCCA AAGCGTGCGG T 
(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = " PRIMER " 

(iii) HYPOTHETICAL : NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47 
CGCGTCTAGA TCACCTGTAT CCAGA 
(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "PRIMER" 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48 
GCGGCGCATA TGGTGGTGGT CCTGGACCTG GAG 
(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = " PRIMER " 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49 
CGCGTCTAGA TCACCTGTAT CCAGA
(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = " PRIMER " 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50 
GTSCTSGTSA AGACSCACTT 
(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = " PRIMER " 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51 
SAGSAGSGCG TTGAASGTGT G 
(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "PRIMER"

(iii) HYPOTHETICAL: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52 
CTCGTTGGTG AAAGTTTCCG TG 
(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = " PRIMER " 

(iii) HYPOTHETICAL : NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53 
CGTCCAGTTC ATCGCCGGAA AGGA 
(2) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = " PRIMER" 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54 
TCTGGCAACA CGTTCTGGAG CACATCC 
(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = " PRIMER " 



(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55
TGCTGGCGTT CATCTTCAGG ATG 
(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "PRIMER"

(iii) HYPOTHETICAL: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56 
CATCCTGAAG ATGAACGCCA GCA 
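
SEQ ID NO:55 and SEQ ID NO:56 appear to be reverse complements of one another, which would be consistent with a primer pair designed against opposite strands of the same site. The listing itself does not state this, so the following is only a minimal sketch of how the relationship can be checked; `reverse_complement` is a hypothetical helper, and the translation table covers the standard IUPAC nucleotide codes.

```python
# Complement table for the IUPAC nucleotide codes (for example S and N stay
# unchanged, R pairs with Y, K pairs with M).
COMPLEMENT = str.maketrans("ACGTSWNRYKMBDHV", "TGCASWNYRMKVHDB")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string, ignoring spaces."""
    seq = dna.replace(" ", "").upper()
    return seq.translate(COMPLEMENT)[::-1]


# Sequences copied from the listing, with the ten-base grouping kept.
seq_55 = "TGCTGGCGTT CATCTTCAGG ATG"
seq_56 = "CATCCTGAAG ATGAACGCCA GCA"

print(reverse_complement(seq_55) == seq_56.replace(" ", ""))
```

The same helper can be applied to any other primer in the listing to test for a complementary partner.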
(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = " PRIMER " 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57 
AGGTTATCCA CAGGGGTCAT GTGCA 
(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = " PRIMER" 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58
GTGTGTCATA TGAACATAAC GGTTCCCAA 29
(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "PRIMER" 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:59: 
GCGCGAATTC TCCCTTGTGG AAGGCTTAG 29 
(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

Arg Val Glu Leu Asp Tyr Asp Ala Leu Thr Leu Asp Asp 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 

Phe Phe Ile Glu Ile Gln Asn His Gly Leu Ser Glu Gln Lys
1 5 10

(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 8 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 62: 

Phe Phe Ile Glu Ile Gln Asn His
1 5 

(2) INFORMATION FOR SEQ ID NO:63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 

Tyr Asp Ala Leu Thr Leu Asp Asp
1 5 

(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: 

Ala Met Gly Lys Lys Lys 
1 5 

(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

Phe Asn Lys Ser His Ser Ala Ala Tyr 
1 5 

(2) INFORMATION FOR SEQ ID NO: 66:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 

Val Val Xaa Asp Xaa Glu Thr Thr Gly 
1 5 
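
The degenerate primer of SEQ ID NO:42 uses the IUPAC codes S (C or G) and N (any base). Read in frame, its nine codons appear to correspond to the peptide Val Val Xaa Asp Xaa Glu Thr Thr Gly given for SEQ ID NO:66. The listing does not state this link explicitly, so the sketch below is only an illustration under the assumption of the standard genetic code: each degenerate codon is expanded into every concrete codon it can represent, and the residue is reported only when all of them agree, otherwise Xaa. The name `translate_degenerate` and the tables are hypothetical helpers for this example.

```python
from itertools import product

# Standard genetic code, indexed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# IUPAC nucleotide codes and the concrete bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate_degenerate(dna: str) -> str:
    """Translate a degenerate DNA string; ambiguous codons are reported as Xaa."""
    seq = dna.replace(" ", "").upper()
    residues = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        choices = (IUPAC[base] for base in seq[i:i + 3])
        encoded = {CODON_TABLE["".join(codon)] for codon in product(*choices)}
        residues.append(THREE_LETTER[encoded.pop()] if len(encoded) == 1 else "Xaa")
    return " ".join(residues)


# SEQ ID NO:42, copied from the listing.
seq_42 = "GTSGTSNNSG ACNNSGAGAC SACSGGG"
print(translate_degenerate(seq_42))
```

Restricting the third codon position to S keeps the primer biased toward G or C in the wobble position, which is a common choice for GC-rich genomes; that motivation is an inference and is not stated in the listing.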

(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67 

His Asn Ala Xaa Phe Asp Xaa Gly Phe 
1 5 

(2) INFORMATION FOR SEQ ID NO: 68:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68 

Val Val Xaa Asp Xaa Glu Thr Thr Gly 
1 5 

(2) INFORMATION FOR SEQ ID NO: 69:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: 

Val Leu Val Lys Thr His Leu 
1 5 



(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70 

His Arg Ala Leu Tyr Asp 
1 5 

(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71 

His Thr Phe Asn Ala Leu Leu 
1 5 

(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:

Asp Arg Tyr Phe Leu Glu Leu Ile Arg Thr Gly Arg Pro Asp Glu Glu
1 5 10 15

Ser Tyr Leu His Ala Ala Val Glu Leu Ala Glu Ala Arg Gly Leu Pro 
20 25 30 

Val Val 



(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY : linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:73: 

Asp His Phe Tyr Leu Glu Leu Ile Arg Thr Gly Arg Ala Asp Glu Glu
1 5 10 15

Ser Tyr Leu His Phe Ala Leu Asp Val Ala Glu Gln Tyr Asp Leu Pro
20 25 30 

Val Val 



(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 

Asp His Phe Tyr Leu Ala Leu Ser Arg Thr Gly Arg Pro Asn Glu Glu 
1 5 10 15 

Arg Tyr Ile Gln Ala Ala Leu Lys Leu Ala Glu Arg Cys Asp Leu Pro
20 25 30 

Leu Val 



(2) INFORMATION FOR SEQ ID NO: 75: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 34 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 

Asp Arg Phe Tyr Phe Glu Ile Met Arg His Asp Leu Pro Glu Glu Gln
1 5 10 15

Phe Ile Glu Asn Ser Tyr Ile Gln Ile Ala Ser Glu Leu Ser Ile Pro
20 25 30

Ile Val



(2) INFORMATION FOR SEQ ID NO: 76:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 34 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: 

Asp Asp Phe Tyr Leu Glu Ile Met Arg His Gly Ile Leu Asp Gln Arg
1 5 10 15

Phe Ile Asp Glu Gln Val Ile Lys Met Ser Leu Glu Thr Gly Leu Lys
20 25 30

Ile Ile



(2) INFORMATION FOR SEQ ID NO: 77:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:77: 

Asp Asp Tyr Tyr Leu Glu Ile Gln Asp His Gly Ser Val Glu Asp Arg
1 5 10 15

Leu Val Asn Ile Asn Leu Val Lys Ile Ala Gln Glu Leu Asp Ile Lys
20 25 30

Ile Val



(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:

Asp Asn Tyr Phe Leu Glu Leu Met Asp His Gly Leu Thr Ile Glu Arg
1 5 10 15

Arg Val Arg Asp Gly Leu Leu Glu Ile Gly Arg Ala Leu Asn Ile Pro
20 25 30 

Pro Leu 



(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 46 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: 

Asn Lys Arg Arg Ala Lys Asn Gly Glu Pro Pro Leu Asp Ile Ala Ala
1 5 10 15

Ile Pro Leu Asp Asp Lys Lys Ser Phe Asp Met Leu Gln Arg Ser Glu
20 25 30

Thr Thr Ala Val Phe Gln Leu Glu Ser Arg Gly Met Lys Asp
35 40 45 

(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 46 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:

Asn Pro Arg Leu Lys Lys Ala Gly Lys Pro Pro Val Arg Ile Glu Ala
1 5 10 15

Ile Pro Leu Asp Asp Ala Arg Ser Phe Arg Asn Leu Gln Asp Ala Lys
20 25 30

Thr Thr Ala Val Phe Gln Leu Glu Ser Arg Gly Met Lys Glu
35 40 45 

(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 46 amino acids 

(B) TYPE : amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: 

Asn Val Arg Met Val Arg Glu Gly Lys Pro Arg Val Asp Ile Ala Ala
1 5 10 15

Ile Pro Leu Asp Asp Pro Glu Ser Phe Glu Leu Leu Lys Arg Ser Glu
20 25 30

Thr Thr Ala Val Phe Gln Leu Glu Ser Arg Gly Met Lys Asp
35 40 45 

(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 46 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82: 

Cys Lys Lys Leu Leu Lys Glu Gln Gly Ile Lys Ile Asp Phe Asp Asp
1 5 10 15

Met Thr Phe Asp Asp Lys Lys Thr Tyr Gln Met Leu Cys Lys Gly Lys
20 25 30

Gly Val Gly Val Phe Gln Phe Glu Ser Ile Gly Met Lys Asp
35 40 45

(2) INFORMATION FOR SEQ ID NO: 83:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83: 

Leu Lys Ile Ile Lys Thr Gln His Lys Ile Ser Val Asp Phe Leu Ser
1 5 10 15

Leu Asp Met Asp Asp Pro Lys Val Tyr Lys Thr Ile Gln Ser Gly Asp
20 25 30

Thr Val Gly Ile Phe Gln Ile Glu Ser Gly Met Phe Gln
35 40 45 

(2) INFORMATION FOR SEQ ID NO: 84: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 46 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84: 

Gln Glu Arg Lys Ala Leu Gln Ile Arg Ala Arg Thr Gly Ser Lys Lys
1 5 10 15

Leu Pro Asp Asp Val Lys Lys Thr His Lys Leu Leu Glu Ala Gly Asp
20 25 30

Leu Glu Gly Ile Phe Gln Leu Glu Ser Gln Gly Met Lys Gln
35 40 45 

(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 46 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:

Ile Asp Asn Val Arg Ala Asn Arg Gly Ile Asp Leu Asp Leu Glu Ser
1 5 10 15

Val Pro Leu Asp Asp Lys Ala Thr Tyr Glu Leu Leu Gly Arg Gly Asp
20 25 30

Thr Leu Gly Val Phe Gln Leu Asp Gly Gly Pro Met Arg Asp
35 40 45 

(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1141 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86: 

ATGGGCCGGG AGCTCCGCTT CGCCCACCTC CACCAGCACA CCCAGTTCTC CCTCCTGGAC 60 

GGGGCGCCGA AGCTTTCCGA CCTCCTCAAG TGGGTGGAGG AGACGACCCC CGAGGACCCC 120 

GCCTTGGCCA TGACCGACCA CGGCAACCTC TTCGGGGCCG TAGAGTTCTA CAAGAAGGCC 180 

GCCGAAATGG GCATCGAGCC CATCCTGGGT ACGAGGCCTT ACGTGGCGGC GGAAAGCCCG 240 

TTTGACCGCA AGCGGGGAAA GGGCCTAGAC GGGGGCTACT TTCACCTCAC CCTCCTCGCC 300

AAGGACTTCA CGGGGTACCA GAACCTGGTG CGCCTGGCGA GCCGGGCTTA CCTGGAGGGG 360

TTTTACGAAA AGCCCCGGAT TGACCGGGAG ATCCTGCGCG AGCGCCGAGG GCCTCATCGC 420

CTCTCGGGGT GCCTCGGGGC GGAGATCCCC CAGTTCATCC TCCAGGACCG TCTGGACCTG 480

GCCGAGGCCC GGCTCAACGA GGACCTCTCC ATCTTCAAGG ACCGCTTCTT CATTCACATC 540 

CAGAACCACG GCCTCCCCGA GCAGAAAAAG GTCAACGAGG TCCTCAAGGA GTTCGCCCGA 600 

AAGTACGGCC TGGGGATGGT GGCCACCAAC GACGGCCATT ACGGGAGGAA GGAGGCCCGC 660 

AGCGCCCACG AGGTTTTCCT CGCCATCCAG TCCAAGAGCA CCCTGGACGA CCCCGGGGCC 720 

GTTGGCTTTC CCCTGCGGGA GTTCTACGTG AAGACCCCCG AGGAGACGTG CGGGCCGGTG 780 

TTCCCCGAGG AGGAGTGGGG GGACGAGCCC TTTGACAACA CCGTGGAGAT CGCCCGCATG 840 

TGCAACGTGG AGCTGCCCAT CGGGACAAGA TGGTCTACCC GAATCCCCCG CTTCCCCCTC 900 

CCCGAGGGAC CGGGGACCGA GGCCAAGTAC CTAATGGAGC TAACCTTCAA GGGGCCCCTC 960

CGCCGTTACC CGGACCGAAT CACCGAGGGT TTCTACCGGG AGGTTTTCCG CCTTTTGGGG 1020

AAGCTTCCCC CCCACGGGCA CGGGGAGGCC TTGGCCGAGG CCTTGGCCCA GGTGGAGCGG 1080 

GAGGCTTGGG AGAGGCTCAT GAAGAGCCTC CCCCCCTTTG ACCGGGGTCC AAGGAGTTCC 1140 

A 1141

(2) INFORMATION FOR SEQ ID NO: 87:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 394 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87: 

Met Gly Arg Glu Leu Arg Phe Ala His Leu His Gln His Thr Gln Phe
1 5 10 15

Ser Leu Leu Asp Gly Ala Pro Lys Leu Ser Asp Leu Leu Lys Trp Val
20 25 30

Glu Glu Thr Thr Pro Glu Asp Pro Ala Leu Ala Met Thr Asp His Gly
35 40 45

Asn Leu Phe Gly Ala Val Glu Phe Tyr Lys Lys Ala Ala Glu Met Gly
50 55 60

Ile Glu Pro Ile Leu Gly Thr Arg Pro Tyr Val Ala Ala Glu Ser Pro
65 70 75 80

Phe Asp Arg Lys Arg Gly Lys Gly Leu Asp Gly Gly Tyr Phe His Leu
85 90 95

Thr Leu Leu Ala Lys Asp Phe Thr Gly Tyr Gln Asn Leu Val Arg Leu
100 105 110

Ala Ser Arg Ala Tyr Leu Glu Gly Phe Tyr Glu Lys Pro Arg Ile Asp
115 120 125

Arg Glu Ile Leu Arg Glu Arg Arg Gly Pro His Arg Leu Ser Gly Cys
130 135 140

Leu Gly Ala Glu Ile Pro Gln Phe Ile Leu Gln Asp Arg Leu Asp Leu
145 150 155 160

Phe Phe Ile Glu Ile Gln Asn His Gly Leu Ser Glu Gln Lys Ala Glu
165 170 175

Ala Arg Leu Asn Glu Asp Leu Ser Ile Phe Lys Asp Arg Phe Phe Ile
180 185 190

His Ile Gln Asn His Gly Leu Pro Glu Gln Lys Lys Val Asn Glu Val
195 200 205 

Leu Lys Glu Phe Ala Arg Lys Tyr Gly Leu Gly Met Val Ala Thr Asn 
210 215 220 

Asp Gly His Tyr Gly Arg Lys Glu Ala Arg Ser Ala His Glu Val Phe 
225 230 235 240 

Leu Ala Ile Gln Ser Lys Ser Thr Leu Asp Asp Pro Gly Ala Val Gly
245 250 255

Phe Pro Leu Arg Glu Phe Tyr Val Lys Thr Pro Glu Glu Thr Cys Gly
260 265 270

Pro Val Phe Pro Glu Glu Glu Trp Gly Asp Glu Pro Phe Asp Asn Thr 
275 280 285 

Val Glu Ile Ala Arg Met Cys Asn Val Glu Leu Pro Ile Gly Thr Arg
290 295 300

Trp Ser Thr Arg Ile Pro Arg Phe Pro Leu Pro Glu Gly Pro Gly Thr
305 310 315 320

Glu Ala Lys Tyr Leu Met Glu Leu Thr Phe Lys Gly Pro Leu Arg Arg
325 330 335

Tyr Pro Asp Arg Ile Thr Glu Gly Phe Tyr Arg Glu Val Phe Arg Leu
340 345 350

Leu Gly Lys Leu Pro Pro His Gly His Gly Glu Ala Leu Ala Glu Ala
355 360 365

Leu Ala Gln Val Glu Arg Glu Ala Trp Glu Arg Leu Met Lys Ser Leu
370 375 380

Pro Pro Phe Asp Arg Gly Pro Arg Ser Ser 
385 390 

(2) INFORMATION FOR SEQ ID NO: 88:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 198 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88: 

Val Glu Arg Val Val Arg Thr Leu Leu Asp Gly Arg Phe Leu Leu Glu
1 5 10 15

Glu Gly Val Gly Leu Trp Glu Trp Arg Tyr Pro Phe Pro Leu Glu Gly
20 25 30

Glu Ala Val Val Val Leu Asp Leu Glu Thr Thr Gly Leu Ala Gly Leu
35 40 45

Asp Glu Val Ile Glu Val Gly Leu Leu Arg Leu Glu Gly Gly Arg Arg
50 55 60

Leu Pro Phe Gln Ser Leu Val Arg Pro Leu Pro Pro Ala Glu Ala Arg
65 70 75 80

Ser Trp Asn Leu Thr Gly Ile Pro Arg Glu Ala Leu Glu Glu Ala Pro
85 90 95

Ser Leu Glu Glu Val Leu Glu Lys Ala Tyr Pro Leu Arg Gly Asp Ala
100 105 110

Thr Leu Val Ile His Asn Ala Ala Phe Asp Leu Gly Phe Leu Arg Pro
115 120 125

Ala Leu Glu Gly Leu Gly Tyr Arg Leu Glu Asn Pro Val Val Asp Ser 
130 135 140 

Leu Arg Leu Ala Arg Arg Gly Leu Pro Gly Leu Arg Arg Tyr Gly Leu 
145 150 155 160 

Asp Ala Leu Ser Glu Val Leu Glu Leu Pro Arg Arg Thr Cys His Arg 
165 170 175 

Ala Leu Glu Asp Val Glu Arg Thr Leu Ala Val Val His Glu Val Tyr 
180 185 190 

Tyr Met Leu Thr Ser Gly 
195 

(2) INFORMATION FOR SEQ ID NO: 89:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 182 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:89: 

Pro Trp Pro Gln Asp Val Val Val Phe Asp Leu Glu Thr Thr Gly Phe
1 5 10 15

Ser Pro Ala Ser Ala Ala Ile Val Glu Ile Gly Ala Val Arg Ile Val
20 25 30

Gly Gly Gln Ile Asp Glu Thr Leu Lys Phe Glu Thr Leu Val Arg Pro
35 40 45

Thr Arg Pro Asp Gly Ser Met Leu Ser Ile Pro Trp Gln Ala Gln Arg
50 55 60

Val His Gly Ile Ser Asp Glu Met Val Arg Arg Ala Pro Ala Xaa Lys
65 70 75 80 

Asp Val Leu Pro Asp Phe Phe Asp Phe Val Asp Gly Ser Ala Val Val 
85 90 95 

Ala His Asn Val Ser Phe Asp Gly Gly Phe Met Arg Ala Gly Ala Glu 
100 105 110 

Arg Leu Gly Leu Ser Trp Ala Pro Glu Arg Glu Leu Cys Thr Met Gln
115 120 125 

Leu Ser Arg Arg Ala Phe Pro Arg Glu Arg Thr His Asn Leu Thr Val 
130 135 140 

Leu Ala Glu Arg Leu Gly Leu Glu Phe Ala Pro Gly Gly Arg His Arg 
145 150 155 160

Ser Tyr Gly Asp Val Gln Val Thr Ala Gln Ala Tyr Leu Arg Leu Leu
165 170 175

Glu Leu Leu Gly Glu Arg 
180 

(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 201 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:90: 

His Gly Ile Lys Met Ile Tyr Gly Met Glu Ala Asn Leu Val Asp Asp
1 5 10 15

Gly Val Pro Ile Ala Tyr Asn Ala Ala His Arg Leu Leu Glu Glu Glu
20 25 30

Thr Tyr Val Val Phe Asp Val Glu Thr Thr Gly Leu Ser Ala Val Tyr
35 40 45

Asp Thr Ile Ile Glu Leu Ala Ala Val Lys Val Lys Gly Gly Glu Ile
50 55 60

Ile Asp Lys Phe Glu Ala Phe Ala Asn Pro His Arg Pro Leu Ser Ala
65 70 75 80

Thr Ile Ile Glu Leu Thr Gly Ile Thr Asp Asp Met Leu Gln Asp Ala
85 90 95

Pro Asp Val Val Asp Val Ile Arg Asp Phe Arg Glu Trp Ile Gly Asp
100 105 110

Asp Ile Leu Val Ala His Asn Ala Ser Phe Asp Met Gly Phe Leu Asn
115 120 125

Val Ala Tyr Lys Lys Leu Leu Glu Val Glu Lys Ala Lys Asn Pro Val
130 135 140

Ile Asp Thr Leu Glu Leu Gly Arg Phe Leu Tyr Pro Glu Phe Lys Asn
145 150 155 160

His Arg Leu Asn Thr Leu Cys Lys Lys Phe Asp Ile Glu Leu Thr Gln
165 170 175

His His Arg Ala Ile Tyr Asp Thr Glu Ala Thr Ala Tyr Leu Leu Leu
180 185 190 

Lys Met Leu Lys Asp Ala Ala Glu Lys 
195 200 



(2) INFORMATION FOR SEQ ID NO: 91: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 188 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91: 

Met Ile Asn Pro Asn Arg Gln Ile Val Leu Asp Thr Glu Thr Thr Gly
1 5 10 15

Met Asn Gln Leu Gly Ala His Tyr Glu Gly His Cys Ile Ile Glu Ile
20 25 30

Gly Ala Val Glu Leu Ile Asn Arg Arg Tyr Thr Gly Asn Asn Xaa His
35 40 45

Ile Tyr Ile Lys Pro Asp Arg Pro Xaa Asp Pro Asp Ala Ile Lys Val
50 55 60

His Gly Ile Thr Asp Glu Met Leu Ala Asp Lys Pro Glu Phe Lys Glu
65 70 75 80

Val Ala Gln Asp Phe Leu Asp Tyr Ile Asn Gly Ala Glu Leu Leu Ile
85 90 95

His Asn Ala Pro Phe Asp Val Gly Phe Met Asp Tyr Glu Phe Arg Lys
100 105 110

Leu Asn Leu Asn Val Lys Thr Asp Asp Ile Cys Leu Val Thr Asp Thr
115 120 125

Leu Gln Met Ala Arg Gln Met Tyr Pro Gly Lys Arg Asn Asn Leu Asp
130 135 140

Ala Leu Cys Asp Arg Leu Gly Ile Asp Asn Ser Lys Arg Thr Leu His
145 150 155 160

Gly Ala Leu Leu Asp Ala Glu Ile Leu Ala Asp Val Tyr Leu Met Met
165 170 175

Thr Gly Gly Gln Thr Asn Leu Phe Asp Glu Glu Glu
180 185 

(2) INFORMATION FOR SEQ ID NO: 92:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 189 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:

Met Ser Thr Ala Ile Thr Arg Gln Ile Val Leu Asp Thr Glu Thr Thr
1 5 10 15

Gly Met Asn Gln Ile Gly Ala His Ser Glu Gly His Lys Ile Ile Glu
20 25 30

Ile Gly Ala Val Glu Val Val Asn Arg Arg Leu Thr Gly Asn Asn Phe
35 40 45

His Val Tyr Leu Lys Asp Arg Leu Val Asp Pro Glu Ala Phe Gly Val
50 55 60

His Gly Ile Ala Val Asp Phe Leu Leu Asp Lys Pro Thr Phe Ala Glu
65 70 75 80

Val Ala Val Glu Phe Met Asp Tyr Ile Arg Gly Ala Glu Leu Val Ile
85 90 95

His Asn Ala Ala Phe Asp Ile Gly Phe Met Asp Tyr Glu Phe Ser Leu
100 105 110

Leu Lys Arg Asp Ile Ala Lys Thr Asn Thr Phe Cys Lys Val Thr Asp
115 120 125

Ser Leu Ala Val Ala Arg Lys Met Phe Pro Gly Lys Arg Asn Ser Leu
130 135 140

Asp Ala Leu Cys Ala Arg Tyr Glu Ile Asp Asn Ser Lys Arg Thr Leu
145 150 155 160

His Gly Ala Leu Leu Asp Ala Gln Ile Leu Ala Glu Val Tyr Leu Ala
165 170 175

Met Thr Gly Gly Gin Thr Ser Met Ala Phe Ala Met Glu 
180 185 

(2) INFORMATION FOR SEQ ID NO: 93:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 201 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93: 

Asn Leu Glu Tyr Leu Lys Ala Cys Gly Leu Asn Phe Ile Glu Thr Ser
1 5 10 15

Glu Asn Leu Ile Thr Leu Lys Asn Leu Lys Thr Pro Leu Lys Asp Glu
20 25 30

Val Phe Ser Phe Ile Asp Leu Glu Thr Thr Gly Ser Cys Pro Ile Lys
35 40 45

His Glu Ile Leu Glu Ile Gly Ala Val Gln Val Lys Gly Gly Glu Ile
50 55 60

Ile Asn Arg Phe Glu Thr Leu Val Lys Val Lys Ser Val Pro Asp Tyr
65 70 75 80

Ile Ala Glu Leu Thr Gly Ile Thr Tyr Glu Asp Thr Leu Asn Ala Pro
85 90 95

Ser Ala His Glu Ala Leu Gln Glu Leu Arg Leu Phe Leu Gly Asn Ser
100 105 110

Val Phe Val Ala His Asn Ala Asn Phe Asp Tyr Asn Phe Leu Gly Arg
115 120 125

Tyr Phe Val Glu Lys Leu His Cys Pro Leu Leu Asn Leu Lys Leu Cys
130 135 140

Thr Leu Asp Leu Ser Lys Arg Ala Ile Leu Ser Met Arg Tyr Ser Leu
145 150 155 160

Ser Phe Leu Lys Glu Leu Leu Gly Phe Gly Ile Glu Val Ser His Arg
165 170 175

Ala Tyr Ala Asp Ala Leu Ala Ser Tyr Lys Leu Phe Glu Ile Cys Leu
180 185 190

Leu Asn Leu Pro Ser Tyr Ile Lys Thr
195 200 

(2) INFORMATION FOR SEQ ID NO: 94: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 630 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:

ATGGTGGAGC GGGTGGTGCG GACCCTTCTG GACGGGAGGT TCCTCCTGGA GGAGGGGGTG 60 

GGGCTTTGGG AGTGGCGCTA CCCCTTTCCC CTGGAGGGGG AGGCGGTGGT GGTCCTGGAC 120

CTGGAGACCA CGGGGCTTGC CGGCCTGGAC GAGGTGATTG AGGTGGGCCT CCTCCGCCTG 180

GAGGGGGGGA GGCGCCTCCC CTTCCAGAGC CTCGTCCGGC CCCTCCCGCC CGCCGAAGCC 240 

CGTTCGTGGA ACCTCACCGG CATCCCCCGG GAGGCCCTGG AGGAGGCCCC CTCCCTGGAG 300 

GAGGTTCTGG AGAAGGCCTA CCCCCTCCGC GGCGACGCCA CCTTGGTGAT CCACAACGCC 360 

GCCTTTGACC TGGGCTTCCT CCGCCCGGCC TTGGAGGGCC TGGGCTACCG CCTGGAAAAC 420 

CCCGTGGTGG ACTCCCTGCG CTTGGCCAGA CGGGGCTTAC CAGGCCTTAG GCGCTACGGC 480

CTGGACGCCC TCTCCGAGGT CCTGGAGCTT CCCCGAAGGA CCTGCCACCG GGCCCTCGAG 540 

GACGTGGAGC GCACCCTCGC CGTGGTGCAC GAGGTATACT ATATGCTTAC GTCCGGCCGT 600

CCCCGCACGC TTTGGGAACT CGGGAGGTAG 630

(2) INFORMATION FOR SEQ ID NO: 95:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 210 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95:

Met Val Glu Arg Val Val Arg Thr Leu Leu Asp Gly Arg Phe Leu Leu
1 5 10 15

Glu Glu Gly Val Gly Leu Trp Glu Trp Arg Tyr Pro Phe Pro Leu Glu 
20 25 30 

Gly Glu Ala Val Val Val Leu Asp Leu Glu Thr Thr Gly Leu Ala Gly 
35 40 45 

Leu Asp Glu Val Ile Glu Val Gly Leu Leu Arg Leu Glu Gly Gly Arg
50 55 60 

Arg Leu Pro Phe Gln Ser Leu Val Arg Pro Leu Pro Pro Ala Glu Ala
65 70 75 80

Arg Ser Trp Asn Leu Thr Gly Ile Pro Arg Glu Ala Leu Glu Glu Ala
85 90 95

Pro Ser Leu Glu Glu Val Leu Glu Lys Ala Tyr Pro Leu Arg Gly Asp 
100 105 110 

Ala Thr Leu Val Ile His Asn Ala Ala Phe Asp Leu Gly Phe Leu Arg
115 120 125 

Pro Ala Leu Glu Gly Leu Gly Tyr Arg Leu Glu Asn Pro Val Val Asp 
130 135 140 

Ser Leu Arg Leu Ala Arg Arg Gly Leu Pro Gly Leu Arg Arg Tyr Gly 
145 150 155 160 

Leu Asp Ala Leu Ser Glu Val Leu Glu Leu Pro Arg Arg Thr Cys His 
165 170 175 

Arg Ala Leu Glu Asp Val Glu Arg Thr Leu Ala Val Val His Glu Val 
180 185 190 

Tyr Tyr Met Leu Thr Ser Gly Arg Pro Arg Thr Leu Trp Glu Leu Gly 
195 200 205 

Arg Glx 
210 

(2) INFORMATION FOR SEQ ID NO: 96: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 461 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96: 

Met Leu Glu Ala Ser Trp Glu Lys Val Gln Ser Ser Leu Lys Gln Asn
1 5 10 15

Leu Ser Lys Pro Ser Tyr Glu Thr Trp Ile Arg Pro Thr Glu Phe Ser
20 25 30

Gly Phe Lys Asn Gly Glu Leu Thr Leu Ile Ala Pro Asn Ser Phe Ser
35 40 45

Ser Ala Trp Leu Lys Asn Asn Tyr Ser Gln Thr Ile Gln Glu Thr Ala
50 55 60

Glu Glu Ile Phe Gly Glu Pro Val Thr Val His Val Lys Val Lys Ala
65 70 75 80

Asn Ala Glu Ser Ser Asp Glu His Tyr Ser Ser Ala Pro Ile Thr Pro
85 90 95

Pro Leu Glu Ala Ser Pro Gly Ser Val Asp Ser Ser Gly Ser Ser Leu
100 105 110

Arg Leu Ser Lys Lys Thr Leu Pro Leu Leu Asn Leu Arg Tyr Val Phe
115 120 125

Asn Arg Phe Val Val Gly Pro Asn Ser Arg Met Ala His Ala Ala Ala
130 135 140

Met Ala Val Ala Glu Ser Pro Gly Arg Glu Phe Asn Pro Leu Phe Ile
145 150 155 160

Cys Gly Gly Val Gly Leu Gly Lys Thr His Leu Met Gln Ala Ile Gly
165 170 175

His Tyr Arg Leu Glu Ile Asp Pro Gly Ala Lys Val Ser Tyr Val Ser
180 185 190

Thr Glu Thr Phe Thr Asn Asp Leu Ile Leu Ala Ile Arg Gln Asp Arg
195 200 205

Met Gln Ala Phe Arg Asp Arg Tyr Arg Ala Ala Asp Leu Ile Leu Val
210 215 220

Asp Asp Ile Gln Phe Ile Glu Gly Lys Glu Tyr Thr Gln Glu Glu Phe
225 230 235 240

Phe His Thr Phe Asn Ala Leu His Asp Ala Gly Ser Gln Ile Val Leu
245 250 255

Ala Ser Asp Arg Pro Pro Ser Gln Ile Pro Arg Leu Gln Glu Arg Leu
260 265 270

Met Ser Arg Phe Ser Met Gly Leu Ile Ala Asp Val Gln Ala Pro Asp
275 280 285

Leu Glu Thr Arg Met Ala Ile Leu Gln Lys Lys Ala Glu His Glu Arg
290 295 300

Val Gly Leu Pro Arg Asp Leu Ile Gln Phe Ile Ala Gly Arg Phe Thr
305 310 315 320

Ser Asn Ile Arg Glu Leu Glu Gly Ala Leu Thr Arg Ala Ile Ala Phe
325 330 335

Ala Ser Ile Thr Gly Leu Pro Met Thr Val Asp Ser Ile Ala Pro Met
340 345 350

Leu Asp Pro Asn Gly Gln Gly Val Glu Val Thr Pro Lys Gln Val Leu
355 360 365

Asp Lys Val Ala Glu Val Phe Lys Val Thr Pro Asp Glu Met Arg Ser
370 375 380

Ala Ser Arg Arg Arg Pro Val Ser Gln Ala Arg Gln Val Gly Met Tyr
385 390 395 400

Leu Met Arg Gln Gly Thr Asn Leu Ser Leu Pro Arg Ile Gly Asp Thr
405 410 415

Phe Gly Gly Lys Asp His Thr Thr Val Met Tyr Ala Ile Glu Gln Val
420 425 430

Glu Lys Lys Leu Ser Ser Asp Pro Gln Ile Ala Ser Gln Val Gln Lys
435 440 445

Ile Arg Asp Leu Leu Gln Ile Asp Ser Arg Arg Lys Arg
450 455 460 

(2) INFORMATION FOR SEQ ID NO: 97:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 447 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97: 

Met Val Ser Cys Glu Asn Leu Trp Gln Gln Ala Leu Ala Ile Leu Ala
1 5 10 15

Thr Gln Leu Thr Lys Pro Ala Phe Asp Thr Trp Ile Lys Ala Ser Val
20 25 30

Leu Ile Ser Leu Gly Asp Gly Val Ala Thr Ile Gln Val Glu Asn Gly
35 40 45

Phe Val Leu Asn His Leu Gln Lys Ser Tyr Gly Pro Leu Leu Met Glu
50 55 60

Val Leu Thr Asp Leu Thr Gly Gln Glu Ile Thr Val Lys Leu Ile Thr
65 70 75 80

Asp Gly Leu Glu Pro His Ser Leu Ile Gly Gln Glu Ser Ser Leu Pro
85 90 95

Met Glu Thr Thr Pro Lys Asn Ala Thr Ala Leu Asn Gly Lys Tyr Thr
100 105 110

Phe Ser Arg Phe Val Val Gly Pro Thr Asn Arg Met Ala His Ala Ala
115 120 125

Ser Leu Ala Val Ala Glu Ser Pro Gly Arg Glu Phe Asn Pro Leu Phe
130 135 140

Leu Cys Gly Gly Val Gly Leu Gly Lys Thr His Leu Met Gln Ala Ile
145 150 155 160

Ala His Tyr Arg Leu Glu Met Tyr Pro Asn Ala Lys Val Tyr Tyr Val
165 170 175

Ser Thr Glu Arg Phe Thr Asn Asp Leu Ile Thr Ala Ile Arg Gln Asp
180 185 190

Asn Met Glu Asp Phe Arg Ser Tyr Tyr Arg Ser Ala Asp Phe Leu Leu
195 200 205

Ile Asp Asp Ile Gln Phe Ile Lys Gly Lys Glu Tyr Thr Gln Glu Glu
210 215 220

Phe Phe His Thr Phe Asn Ser Leu His Glu Ala Gly Lys Gln Val Val
225 230 235 240

Val Ala Ser Asp Arg Ala Pro Gln Arg Ile Pro Gly Leu Gln Asp Arg
245 250 255

Leu Ile Ser Arg Phe Ser Met Gly Leu Ile Ala Asp Ile Gln Val Pro
260 265 270

Asp Leu Glu Thr Arg Met Ala Ile Leu Gln Lys Lys Ala Glu Tyr Asp
275 280 285

Arg Ile Arg Leu Pro Lys Glu Val Ile Glu Tyr Ile Ala Ser His Tyr
290 295 300

Thr Ser Asn Ile Arg Glu Leu Glu Gly Ala Leu Ile Arg Ala Ile Ala
305 310 315 320

Tyr Thr Ser Leu Ser Asn Val Ala Met Thr Val Glu Asn Ile Ala Pro
325 330 335

Val Leu Asn Pro Pro Val Glu Lys Val Ala Ala Ala Pro Glu Thr Ile
340 345 350

Ile Thr Ile Val Ala Gln His Tyr Gln Leu Lys Val Glu Glu Leu Leu
355 360 365

Ser Asn Ser Arg Arg Arg Glu Val Ser Leu Ala Arg Gln Val Gly Met
370 375 380

Tyr Leu Met Arg Gln His Thr Asp Leu Ser Leu Pro Arg Ile Gly Glu
385 390 395 400



Ala Phe Gly Gly Lys Asp His Thr Thr Val Met Tyr Ser Cys Asp Lys
405 410 415

Ile Thr Gln Leu Gln Gln Lys Asp Trp Glu Thr Ser Gln Thr Leu Thr
420 425 430

Ser Leu Ser His Arg Ile Asn Ile Ala Gly Gln Ala Pro Glu Ser
435 440 445

(2) INFORMATION FOR SEQ ID NO: 98:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 446 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:

Met Glu Asn Ile Leu Asp Leu Trp Asn Gln Ala Leu Ala Gln Ile Glu
1 5 10 15

Lys Lys Leu Ser Lys Pro Ser Phe Glu Thr Trp Met Lys Ser Thr Lys
20 25 30

Ala His Ser Leu Gln Gly Asp Thr Leu Thr Ile Thr Ala Pro Asn Glu
35 40 45

Phe Ala Arg Asp Trp Leu Glu Ser Arg Tyr Leu His Leu Ile Ala Asp
50 55 60

Thr Ile Tyr Glu Leu Thr Gly Glu Glu Leu Ser Ile Lys Phe Val Ile
65 70 75 80

Pro Gln Asn Gln Asp Val Glu Asp Phe Met Pro Lys Pro Gln Val Lys
85 90 95

Lys Ala Val Lys Glu Asp Thr Ser Asp Phe Pro Gln Asn Met Leu Asn
100 105 110

Pro Lys Tyr Thr Phe Asp Thr Phe Val Ile Gly Ser Gly Asn Arg Phe
115 120 125

Ala His Ala Ala Ser Leu Ala Val Ala Glu Ala Pro Ala Lys Ala Tyr
130 135 140

Asn Pro Leu Phe Ile Tyr Gly Gly Val Gly Leu Gly Lys Thr His Leu
145 150 155 160

Met His Ala Ile Gly His Tyr Val Ile Asp His Asn Pro Ser Ala Lys
165 170 175

Val Val Tyr Leu Ser Ser Glu Lys Phe Thr Asn Glu Phe Ile Asn Ser
180 185 190

Ile Arg Asp Asn Lys Ala Val Asp Phe Arg Asn Arg Tyr Arg Asn Val
195 200 205

Asp Val Leu Leu Ile Asp Asp Ile Gln Phe Leu Ala Gly Lys Glu Gln
210 215 220

Thr Gln Glu Glu Phe Phe His Thr Phe Asn Thr Leu His Glu Glu Ser
225 230 235 240

Lys Gln Ile Val Ile Ser Ser Asp Arg Pro Pro Lys Glu Ile Pro Thr
245 250 255

Leu Glu Asp Arg Leu Arg Ser Arg Phe Glu Trp Gly Leu Ile Thr Asp
260 265 270

Ile Thr Pro Pro Asp Leu Glu Thr Arg Ile Ala Ile Leu Arg Lys Lys
275 280 285

Ala Lys Ala Glu Gly Leu Asp Ile Pro Asn Glu Val Met Leu Tyr Ile
290 295 300

Ala Asn Gln Ile Asp Ser Asn Ile Arg Glu Leu Glu Gly Ala Leu Ile
305 310 315 320

Arg Val Val Ala Tyr Ser Ser Leu Ile Asn Lys Asp Ile Asn Ala Asp
325 330 335

Leu Ala Ala Glu Ala Leu Lys Asp Ile Ile Pro Ser Ser Lys Pro Lys
340 345 350

Val Ile Thr Ile Lys Glu Ile Gln Arg Val Val Gly Gln Gln Phe Asn
355 360 365

Ile Lys Leu Glu Asp Phe Lys Ala Lys Lys Arg Thr Lys Ser Val Ala
370 375 380

Phe Pro Arg Gln Ile Ala Met Tyr Leu Ser Arg Glu Met Thr Asp Ser
385 390 395 400

Ser Leu Pro Lys Ile Gly Glu Glu Phe Gly Gly Arg Asp His Thr Thr
405 410 415

Val Ile His Ala His Glu Lys Ile Ser Lys Leu Leu Ala Asp Asp Glu
420 425 430

Gln Leu Gln Gln His Val Lys Glu Ile Lys Glu Gln Leu Lys
435 440 445 

(2) INFORMATION FOR SEQ ID NO: 99: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 507 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99: 

Met Thr Asp Asp Pro Gly Ser Gly Phe Thr Thr Val Trp Asn Ala Val
1 5 10 15

Val Ser Glu Leu Asn Gly Asp Pro Lys Val Asp Asp Gly Pro Ser Ser
20 25 30

Asp Ala Asn Leu Ser Ala Pro Leu Thr Pro Gln Gln Arg Ala Trp Leu
35 40 45

Asn Leu Val Gln Pro Leu Thr Ile Val Glu Gly Phe Ala Leu Leu Ser
50 55 60

Val Pro Ser Ser Phe Val Gln Asn Glu Ile Glu Arg His Leu Arg Ala
65 70 75 80

Pro Ile Thr Asp Ala Leu Ser Arg Arg Leu Gly His Gln Ile Gln Leu
85 90 95

Gly Val Arg Ile Ala Pro Pro Ala Thr Asp Glu Ala Asp Asp Thr Thr
100 105 110

Val Pro Pro Ser Glu Asn Pro Ala Thr Thr Ser Pro Asp Thr Thr Thr
115 120 125

Asp Asn Asp Glu Ile Asp Asp Ser Ala Ala Ala Arg Gly Asp Asn Gln
130 135 140

His Ser Trp Pro Ser Tyr Phe Thr Glu Arg Pro His Asn Thr Asp Ser
145 150 155 160

Ala Thr Ala Gly Val Thr Ser Leu Asn Arg Arg Tyr Thr Phe Asp Thr
165 170 175

Phe Val Ile Gly Ala Ser Asn Arg Phe Ala His Ala Ala Ala Leu Ala
180 185 190

Ile Ala Glu Ala Pro Ala Arg Ala Tyr Asn Pro Leu Phe Ile Trp Gly
195 200 205

Glu Ser Gly Leu Gly Lys Thr His Leu Leu His Ala Ala Gly Asn Tyr
210 215 220

Ala Gln Arg Leu Phe Pro Gly Met Arg Val Lys Tyr Val Ser Thr Glu
225 230 235 240

Glu Phe Thr Asn Asp Phe Ile Asn Ser Leu Arg Asp Asp Arg Lys Val
245 250 255

Ala Phe Lys Arg Ser Tyr Arg Asp Val Asp Val Leu Leu Val Asp Asp
260 265 270

Ile Gln Phe Ile Glu Gly Lys Glu Gly Ile Gln Glu Glu Phe Phe His
275 280 285

Thr Phe Asn Thr Leu His Asn Ala Asn Lys Gln Ile Val Ile Ser Ser
290 295 300

Asp Arg Pro Pro Lys Gln Leu Ala Thr Leu Glu Asp Arg Leu Arg Thr
305 310 315 320

Arg Phe Glu Trp Gly Leu Ile Thr Asp Val Gln Pro Pro Glu Leu Glu
325 330 335

Thr Arg Ile Ala Ile Leu Arg Lys Lys Ala Gln Met Glu Arg Leu Ala
340 345 350

Val Pro Asp Asp Val Leu Glu Leu Ile Ala Ser Ser Ile Glu Arg Asn
355 360 365

Ile Arg Glu Leu Glu Gly Ala Leu Ile Arg Val Thr Ala Phe Ala Ser
370 375 380

Leu Asn Lys Thr Pro Ile Asp Lys Ala Leu Ala Glu Ile Val Leu Arg
385 390 395 400

Asp Leu Ile Ala Asp Ala Asn Thr Met Gln Ile Ser Ala Ala Thr Ile
405 410 415

Met Ala Ala Thr Ala Glu Tyr Phe Asp Thr Thr Val Glu Glu Leu Arg
420 425 430

Gly Pro Gly Lys Thr Arg Ala Leu Ala Gln Ser Arg Gln Ile Ala Met
435 440 445

Tyr Leu Cys Arg Glu Leu Thr Asp Leu Ser Leu Pro Lys Ile Gly Gln
450 455 460

Ala Phe Gly Arg Asp His Thr Thr Val Met Tyr Ala Gln Arg Lys Ile
465 470 475 480

Leu Ser Glu Met Ala Glu Arg Arg Glu Val Phe Asp His Val Lys Glu
485 490 495

Leu Thr Thr Arg Ile Arg Gln Arg Ser Lys Arg
500 505 

(2) INFORMATION FOR SEQ ID NO: 100:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 446 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100: 

Met Ser His Glu Ala Val Trp Gln His Val Leu Glu His Ile Arg Arg
1 5 10 15

Ser Ile Thr Glu Val Glu Phe His Thr Trp Phe Glu Arg Ile Arg Pro
20 25 30

Leu Gly Ile Arg Asp Gly Val Leu Glu Leu Ala Val Pro Thr Ser Phe
35 40 45

Ala Leu Asp Trp Ile Arg Arg His Tyr Ala Gly Leu Ile Gln Glu Gly
50 55 60

Pro Arg Leu Leu Gly Ala Gln Ala Pro Arg Phe Glu Leu Arg Val Val
65 70 75 80

Pro Gly Val Val Val Gln Glu Asp Ile Phe Gln Pro Pro Pro Ser Pro
85 90 95

Pro Ala Gln Ala Gln Pro Glu Asp Thr Phe Lys Thr Ser Trp Trp Gly
100 105 110

Pro Thr Thr Pro Trp Pro His Gly Gly Ala Val Ala Val Ala Glu Ser
115 120 125

Pro Gly Arg Ala Tyr Asn Pro Leu Phe Ile Tyr Gly Gly Arg Gly Leu
130 135 140

Gly Lys Thr Tyr Leu Met His Ala Val Gly Pro Leu Arg Ala Lys Arg
145 150 155 160

Phe Pro His Met Arg Leu Glu Tyr Val Ser Thr Glu Thr Phe Thr Asn
165 170 175

Glu Leu Ile Asn Arg Pro Ser Ala Arg Asp Arg Met Thr Glu Phe Arg
180 185 190

Glu Arg Tyr Arg Ser Val Asp Leu Leu Leu Val Asp Asp Val Gln Phe
195 200 205

Ile Ala Gly Lys Glu Arg Thr Gln Glu Glu Phe Phe His Thr Phe Asn
210 215 220

Ala Leu Tyr Glu Ala His Lys Gln Ile Ile Leu Ser Ser Asp Arg Pro
225 230 235 240

Pro Lys Asp Ile Leu Thr Leu Glu Ala Arg Leu Arg Ser Arg Phe Glu
245 250 255

Trp Gly Leu Ile Thr Asp Asn Pro Ala Pro Asp Leu Glu Thr Arg Ile
260 265 270

Ala Ile Leu Lys Met Asn Ala Ser Ser Gly Pro Glu Asp Pro Glu Asp
275 280 285

Ala Leu Glu Tyr Ile Ala Arg Gln Val Thr Ser Asn Ile Arg Glu Trp
290 295 300

Glu Gly Ala Leu Met Arg Ala Ser Pro Phe Ala Ser Leu Asn Gly Val
305 310 315 320

Glu Leu Thr Arg Ala Val Ala Ala Lys Ala Leu Arg His Leu Arg Pro
325 330 335

Arg Glu Leu Glu Ala Asp Pro Leu Glu Ile Ile Arg Lys Ala Ala Gly
340 345 350

Pro Val Arg Pro Glu Thr Pro Gly Gly Ala His Gly Glu Arg Arg Lys
355 360 365

Lys Glu Val Val Leu Pro Arg Gln Leu Ala Met Tyr Leu Val Arg Glu
370 375 380

Leu Thr Pro Ala Ser Leu Pro Glu Ile Gly Gln Leu Phe Gly Gly Arg
385 390 395 400

Asp His Thr Thr Val Arg Tyr Ala Ile Gln Lys Val Gln Glu Leu Ala
405 410 415

Gly Lys Pro Asp Arg Glu Val Gln Gly Leu Leu Arg Thr Leu Arg Glu
420 425 430

Ala Cys Thr Asp Pro Val Asp Asn Leu Trp Ile Thr Cys Gly
435 440 445



(2) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 467 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101: 

Met Ser Leu Ser Leu Trp Gln Gln Cys Leu Ala Arg Leu Gln Asp Glu
1 5 10 15

Leu Pro Ala Thr Glu Phe Ser Met Trp Ile Arg Pro Leu Gln Ala Glu
20 25 30

Leu Ser Asp Asn Thr Leu Ala Leu Tyr Ala Pro Asn Arg Phe Val Leu
35 40 45

Asp Trp Val Arg Asp Lys Tyr Leu Asn Asn Ile Asn Gly Leu Leu Thr
50 55 60

Ser Phe Cys Gly Ala Asp Ala Pro Gln Leu Arg Phe Glu Val Gly Thr
65 70 75 80

Lys Pro Val Thr Gln Thr Pro Gln Ala Ala Val Thr Ser Asn Val Ala
85 90 95

Ala Pro Ala Gln Val Ala Gln Thr Gln Pro Gln Arg Ala Ala Pro Ser
100 105 110

Thr Arg Ser Gly Trp Asp Asn Val Pro Ala Pro Ala Glu Pro Thr Tyr
115 120 125

Arg Ser Asn Val Asn Val Lys His Thr Phe Asp Asn Phe Val Glu Gly
130 135 140

Lys Ser Asn Gln Leu Ala Arg Ala Ala Ala Arg Gln Val Ala Asp Asn
145 150 155 160

Pro Gly Gly Ala Tyr Asn Pro Leu Phe Leu Tyr Gly Gly Thr Gly Leu
165 170 175

Gly Lys Thr His Leu Leu His Ala Val Gly Asn Gly Ile Met Ala Arg
180 185 190

Lys Pro Asn Ala Lys Val Val Tyr Met His Ser Glu Arg Phe Val Gln
195 200 205

Asp Met Val Lys Ala Leu Gln Asn Asn Ala Ile Glu Glu Phe Lys Arg
210 215 220

Tyr Tyr Arg Ser Val Asp Ala Leu Leu Ile Asp Asp Ile Gln Phe Phe
225 230 235 240

Ala Asn Lys Glu Arg Ser Gln Glu Glu Phe Phe His Thr Phe Asn Ala
245 250 255

Leu Leu Glu Gly Asn Gln Gln Ile Ile Leu Thr Ser Asp Arg Tyr Pro
260 265 270

Lys Glu Ile Asn Gly Val Glu Asp Arg Leu Lys Ser Arg Phe Gly Trp
275 280 285

Gly Leu Thr Val Ala Ile Glu Pro Pro Glu Leu Glu Thr Arg Val Ala
290 295 300

Ile Leu Met Lys Lys Ala Asp Glu Asn Asp Ile Arg Leu Pro Gly Glu
305 310 315 320

Val Ala Phe Phe Ile Ala Lys Arg Leu Arg Ser Asn Val Arg Glu Leu
325 330 335

Glu Gly Ala Leu Asn Arg Val Ile Ala Asn Ala Asn Phe Thr Gly Arg
340 345 350

Ala Ile Thr Ile Asp Phe Val Arg Glu Ala Leu Arg Asp Leu Leu Ala
355 360 365

Leu Gln Glu Lys Leu Val Thr Ile Asp Asn Ile Gln Lys Thr Val Ala
370 375 380

Glu Tyr Tyr Lys Ile Lys Val Ala Asp Leu Leu Ser Lys Arg Arg Ser
385 390 395 400

Arg Ser Val Ala Arg Pro Arg Gln Met Ala Met Ala Leu Ala Lys Glu
405 410 415

Leu Thr Asn His Ser Leu Pro Glu Ile Gly Asp Ala Phe Gly Gly Arg
420 425 430

Asp His Thr Thr Val Leu His Ala Cys Arg Lys Ile Glu Gln Leu Arg
435 440 445

Glu Glu Ser His Asp Ile Lys Glu Asp Phe Ser Asn Leu Ile Arg Thr
450 455 460

Leu Ser Ser 
465 

(2) INFORMATION FOR SEQ ID NO: 102: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 440 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:102: 

Met Lys Glu Arg Ile Leu Gln Glu Ile Lys Thr Arg Val Asn Arg Lys
1 5 10 15

Ser Trp Glu Leu Trp Phe Ser Ser Phe Asp Val Lys Ser Ile Glu Gly
20 25 30

Asn Lys Val Val Phe Ser Val Gly Asn Leu Phe Ile Lys Glu Trp Leu
35 40 45

Glu Lys Lys Tyr Tyr Ser Val Leu Ser Lys Ala Val Lys Val Val Leu
50 55 60

Gly Asn Asp Ala Thr Phe Glu Ile Thr Tyr Glu Ala Phe Glu Pro His
65 70 75 80

Ser Ser Tyr Ser Glu Pro Leu Val Lys Lys Arg Ala Val Leu Leu Thr
85 90 95

Pro Leu Asn Pro Asp Tyr Thr Phe Glu Asn Phe Val Val Gly Pro Gly
100 105 110

Asn Ser Phe Ala Tyr His Ala Ala Leu Glu Val Ala Lys His Pro Gly
115 120 125

Arg Tyr Asn Pro Leu Phe Ile Tyr Gly Gly Val Gly Leu Gly Lys Thr
130 135 140

His Leu Leu Gln Ser Ile Gly Asn Tyr Val Val Gln Asn Glu Pro Asp
145 150 155 160

Leu Arg Val Met Tyr Ile Thr Ser Glu Lys Phe Leu Asn Asp Leu Val
165 170 175

Asp Ser Met Lys Glu Gly Lys Leu Asn Glu Phe Arg Glu Lys Tyr Arg
180 185 190

Lys Lys Val Asp Ile Leu Leu Ile Asp Asp Val Gln Phe Leu Ile Gly
195 200 205

Lys Thr Gly Val Gln Thr Glu Leu Phe His Thr Phe Asn Glu Leu His
210 215 220

Asp Ser Gly Lys Gln Ile Val Ile Cys Ser Asp Arg Glu Pro Gln Lys
225 230 235 240

Leu Ser Glu Phe Gln Asp Arg Leu Val Ser Arg Phe Gln Met Gly Leu
245 250 255

Val Ala Lys Leu Glu Pro Pro Asp Glu Glu Thr Arg Lys Ser Ile Ala
260 265 270

Arg Lys Met Leu Glu Ile Glu His Gly Glu Leu Pro Glu Glu Val Leu
275 280 285

Asn Phe Val Ala Glu Asn Val Asp Asp Asn Leu Arg Arg Leu Arg Gly
290 295 300

Ala Ile Ile Lys Leu Leu Val Tyr Lys Glu Thr Thr Gly Lys Glu Val
305 310 315 320

Asp Leu Lys Glu Ala Ile Leu Leu Leu Lys Asp Phe Ile Lys Pro Asn
325 330 335

Arg Val Lys Ala Met Asp Pro Ile Asp Glu Leu Ile Glu Ile Val Ala
340 345 350

Lys Val Thr Gly Val Pro Arg Glu Glu Ile Leu Ser Asn Ser Arg Asn
355 360 365

Val Lys Ala Leu Thr Ala Arg Arg Ile Gly Met Tyr Val Ala Lys Asn
370 375 380

Tyr Leu Lys Ser Ser Leu Arg Thr Ile Ala Glu Lys Phe Asn Arg Ser
385 390 395 400

His Pro Val Val Val Asp Ser Val Lys Lys Val Lys Asp Ser Leu Leu
405 410 415

Lys Gly Asn Lys Gln Leu Lys Ala Leu Ile Asp Glu Val Ile Gly Glu
420 425 430

Ile Ser Arg Arg Ala Leu Ser Gly
435 440 

(2) INFORMATION FOR SEQ ID NO: 103: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 457 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:103: 

Met Asp Thr Asn Asn Asn Ile Glu Lys Glu Ile Leu Ala Leu Val Lys
1 5 10 15

Gln Asn Pro Lys Val Ser Leu Ile Glu Tyr Glu Asn Tyr Phe Ser Gln
20 25 30

Leu Lys Tyr Asn Pro Asn Ala Ser Lys Ser Asp Ile Ala Phe Phe Tyr
35 40 45

Ala Pro Asn Gln Val Leu Cys Thr Thr Ile Thr Ala Lys Tyr Gly Ala
50 55 60

Leu Leu Lys Glu Ile Leu Ser Gln Asn Lys Val Gly Met His Leu Ala
65 70 75 80

His Ser Val Asp Val Arg Ile Glu Val Ala Pro Lys Ile Gln Ile Asn
85 90 95

Ala Gln Ser Asn Ile Asn Tyr Lys Ala Ile Lys Thr Ser Val Lys Asp
100 105 110

Ser Tyr Thr Phe Glu Asn Phe Val Val Gly Ser Cys Asn Asn Thr Val
115 120 125

Tyr Glu Ile Ala Lys Lys Val Ala Gln Ser Asp Thr Pro Pro Tyr Asn
130 135 140

Pro Val Leu Phe Tyr Gly Gly Thr Gly Leu Gly Lys Thr His Ile Leu
145 150 155 160

Asn Ala Ile Gly Asn His Ala Leu Glu Lys His Lys Lys Val Val Leu
165 170 175

Val Thr Ser Glu Asp Phe Leu Thr Asp Phe Leu Lys His Leu Asp Asn
180 185 190

Lys Thr Met Asp Ser Phe Lys Ala Lys Tyr Arg His Cys Asp Phe Phe
195 200 205

Leu Leu Asp Asp Ala Gln Phe Leu Gln Gly Lys Pro Lys Leu Glu Glu
210 215 220

Glu Phe Phe His Thr Phe Asn Glu Leu His Ala Asn Ser Lys Gln Ile
225 230 235 240

Val Leu Ile Ser Asp Arg Ser Pro Lys Asn Ile Ala Gly Leu Glu Asp
245 250 255

Arg Leu Lys Ser Arg Phe Glu Trp Gly Ile Thr Ala Lys Val Met Pro
260 265 270

Pro Asp Leu Glu Thr Lys Leu Ser Ile Val Lys Gln Lys Cys Gln Leu
275 280 285

Asn Gln Ile Thr Leu Pro Glu Glu Val Met Glu Tyr Ile Ala Gln His
290 295 300

Ile Ser Asp Asn Ile Arg Gln Met Glu Gly Ala Ile Ile Lys Ile Ser
305 310 315 320

Val Asn Ala Asn Leu Met Asn Ala Ser Ile Asp Leu Asn Leu Ala Lys
325 330 335

Thr Val Leu Glu Asp Leu Gln Lys Asp His Ala Glu Gly Ser Ser Leu
340 345 350

Glu Asn Ile Leu Leu Ala Val Ala Gln Ser Leu Asn Leu Lys Ser Ser
355 360 365

Glu Ile Lys Val Ser Ser Arg Gln Lys Asn Val Ala Leu Ala Arg Lys
370 375 380

Leu Val Val Tyr Phe Ala Arg Leu Tyr Thr Pro Asn Pro Thr Leu Ser
385 390 395 400

Leu Ala Gln Phe Leu Asp Leu Lys Asp His Ser Ser Ile Ser Lys Met
405 410 415

Tyr Ser Gly Val Lys Lys Met Leu Glu Glu Glu Lys Ser Pro Phe Val
420 425 430

Leu Ser Leu Arg Glu Glu Ile Lys Asn Arg Leu Asn Glu Leu Asn Asp
435 440 445 

Lys Lys Thr Ala Phe Asn Ser Ser Glu 
450 455 

(2) INFORMATION FOR SEQ ID NO: 104: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1305 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104:



GTGTCGCACG AGGCCGTCTG GCAACACGTT [illegible] [illegible] CATCACCGAG 60

GTGGAGTTCC ACACCTGGTT TGAAAGGATP PPPPPPTTPP [illegible] CGGGGTGCTG 120

GAGCTCGCCG TGCCCACCTC [illegible] [illegible] [illegible] CGCCGGCCTC 180

ATCCAGGAGG GCCCTCGGCT [illegible] [illegible] [illegible] CCGGGTGGTG 240

CCCGGGGTCG TAGTCCAGGA GGACATPTTP [illegible] [illegible] GGCCCAAGCT 300

CAACCCGAAG ATACCTTTAA AACTTCGTGG [illegible] [illegible] CaCCCCACGGC 360

GGCGCCGTGG CCGTGGCCGA GTCCCCCGGP [illegible] [illegible] CATCTACGGG 420

GGCCGTGGCC TGGGAAAGAC CTAPPTGATP [illegible] [illegible] TGCGAAGCGC 480

TTCCCCCACA TGAGATTAGA GTACGTTTPP [illegible] [illegible] GCTCATCAAC 540

CGGCCATCCG CGAGGGACCG [illegible] [illegible] GGTACCGCTC CGTGGACCTC 600

CTGCTGGTGG ACGACGTCCA [illegible] [illegible] GCACCCAGGA GGAGTTTTTC 660

CACACCTTCA ACGCCCTTTA [illegible] [illegible] IuCTCTCCTC CGACCGGCCG 720

CCCAAGGACA TCCTCACCCT GGAGGCGCGP PTPPPPanpP [illegible] GGGCCTGATC 780

ACCGACAATC CAGCCCCCGA CCTGGAAACC [illegible] [illegible] GAACGCCAGC 840

AGCGGGCCTG AGGATCCCGA GGACGCCCTG [illegible] [illegible] CACCTCCAAC 900

ATCCGGGAGT GGGAAGGGGC [illegible] [illegible] TCGCCTCCCT CAACGGCGTT 960

GAGCTGACCC GCGCCGTGGC GGCCAAGGCT CTCCGACATC TTCGCCCCAG GGAGCTGGAG 1020

GCGGACCCCT TGGAGATCAT CCGCAAAGCG GCGGGACCAG TTCGGCCTGA AACCCCGGGA 1080

GGAGCTCACG GGGAGCGCCG CAAGAAGGAG GTGGTCCTCC CCCGGCAGCT CGCCATGTAC 1140

CTGGTGCGGG AGCTCACCCC GGCCTCCCTG CCCGAGATCG ACCAGCTCAA CGACGACCGG 1200

GACCACACCA CGGTCCTCTA CGCCATCCAG AAGGTCCAGG AGCTCGCGGA AAGCGACCGG 1260

GAGGTGCAGG GCCTCCTCCG CACCCTCCGG GAGGCGTGCA CATGA 1305


(2) INFORMATION FOR SEQ ID NO: 105:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 434 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105:

Val Ser His Glu Ala Val Trp Gln His Val Leu Glu His Ile Arg Arg
1 5 10 15

Ser Ile Thr Glu Val Glu Phe His Thr Trp Phe Glu Arg Ile Arg Pro
20 25 30

Leu Gly Ile Arg Asp Gly Val Leu Glu Leu Ala Val Pro Thr Ser Phe
35 40 45

Ala Leu Asp Trp Ile Arg Arg His Tyr Ala Gly Leu Ile Gln Glu Gly
50 55 60

Pro Arg Leu Leu Gly Ala Gln Ala Pro Arg Phe Glu Leu Arg Val Val
65 70 75 80

Pro Gly Val Val Val Gln Glu Asp Ile Phe Gln Pro Pro Pro Ser Pro
85 90 95

Pro Ala Gln Ala Gln Pro Glu Asp Thr Phe Lys Thr Ser Trp Trp Gly
100 105 110

Pro Thr Thr Pro Trp Pro His Gly Gly Ala Val Ala Val Ala Glu Ser
115 120 125

Pro Gly Arg Ala Tyr Asn Pro Leu Phe Ile Tyr Gly Gly Arg Gly Leu
130 135 140

Gly Lys Thr Tyr Leu Met His Ala Val Gly Pro Leu Arg Ala Lys Arg
145 150 155 160

Phe Pro His Met Arg Leu Glu Tyr Val Ser Thr Glu Thr Phe Thr Asn
165 170 175

Glu Leu Ile Asn Arg Pro Ser Ala Arg Asp Arg Met Thr Glu Phe Arg
180 185 190

Glu Arg Tyr Arg Ser Val Asp Leu Leu Leu Val Asp Asp Val Gln Phe
195 200 205

Ile Ala Gly Lys Glu Arg Thr Gln Glu Glu Phe Phe His Thr Phe Asn
210 215 220

Ala Leu Tyr Glu Ala His Lys Gln Ile Ile Leu Ser Ser Asp Arg Pro
225 230 235 240

Pro Lys Asp Ile Leu Thr Leu Glu Ala Arg Leu Arg Ser Arg Phe Glu
245 250 255

Trp Gly Leu Ile Thr Asp Asn Pro Ala Pro Asp Leu Glu Thr Arg Ile
260 265 270

Ala Ile Leu Lys Met Asn Ala Ser Ser Gly Pro Glu Asp Pro Glu Asp
275 280 285

Ala Leu Glu Tyr Ile Ala Arg Gln Val Thr Ser Asn Ile Arg Glu Trp
290 295 300

Glu Gly Ala Leu Met Arg Ala Ser Pro Phe Ala Ser Leu Asn Gly Val
305 310 315 320



Glu Leu Thr Arg Ala Val Ala Ala Lys Ala Leu Arg His Leu Arg Pro
325 330 335

Arg Glu Leu Glu Ala Asp Pro Leu Glu Ile Ile Arg Lys Ala Ala Gly
340 345 350

Pro Val Arg Pro Glu Thr Pro Gly Gly Ala His Gly Glu Arg Arg Lys
355 360 365

Lys Glu Val Val Leu Pro Arg Gln Leu Ala Met Tyr Leu Val Arg Glu
370 375 380

Leu Thr Pro Ala Ser Leu Pro Glu Ile Asp Gln Leu Asn Asp Asp Arg
385 390 395 400

Asp His Thr Thr Val Leu Tyr Ala Ile Gln Lys Val Gln Glu Leu Ala
405 410 415

Glu Ser Asp Arg Glu Val Gln Gly Leu Leu Arg Thr Leu Arg Glu Ala
420 425 430

Cys Thr



(2) INFORMATION FOR SEQ ID NO: 106: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1128 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106: 

ATGAACATAA CGGTTCCCAA AAAACTCCTC TCGGACCAGC TTTCCCTCCT GGAGCGCATC 60

GTCCCCTCTA GAAGCGCCAA CCCCCTCTAC ACCTACCTGG GGCTTTACGC CGAGGAAGGG 120

GCCTTGATCC TCTTCGGGAC CAACGGGGAG GTGGACCTCG AGGTCCGCCT CCCCGCCGAG 180 

GCCCAAAGCC TTCCCCGGGT GCTCGTCCCC GCCCAGCCCT TCTTCCAGCT GGTGCGGAGC 240 

CTTCCTGGGG ACCTCGTGGC CCTCGGCCTC GCCTCGGAGC CGGGCCAGGG GGGGCAGCTG 300 

GAGCTCTCCT CCGGGCGTTT CCGCACCCGG CTCAGCCTGG CCCCTGCCGA GGGCTACCCC 360

GAGCTTCTGG TGCCCGAGGG GGAGGACAAG GGGGCCTTCC CCCTCCGGAC GCGGATGCCC 420 

TCCGGGGAGC TCGTCAAGGC CTTGACCCAC GTGCGCTACG CCGCGAGCAA CGAGGAGTAC 480 

CGGGCCATCT TCCGCGGGGT GCAGCTGGAG TTCTCCCCCC AGGGCTTCCG GGCGGTGGCC 540 

TCCGACGGGT ACCGCCTCGC CCTCTACGAC CTGCCCCTGC CCCAAGGGTT CCAGGCCAAG 600 

GCCGTGGTCC CCGCCCGGAG CGTGGACGAG ATGGTGCGGG TCCTGAAGGG GGCGGACGGG 660 

GCCGAGGCCG TCCTCGCCCT GGGCGAGGGG GTGTTGGCCC TGGCCCTCGA GGGCGGAAGC 720 



GGGGTCCGGA TGGCCCTCCG CCTCATGGAA GGGGAGTTCC CCGACTACCA GAGGGTCATC 780

CCCCAGGAGT TCGCCCTCAA GGTCCAGGTG GAGGGGGAGG CCCTCAGGGA GGCGGTGCGC 840 

CGGGTGAGCG TCCTCTCCGA CCGGCAGAAC CACCGGGTGG ACCTCCTTTT GGAGGAAGGC 900 

CGGATCCTCC TCTCCGCCGA GGGGGACTAC GGCAAGGGGC AGGAGGAGGT GCCCGCCCAG 960 

GTGGAGGGGC CGGACATGGC CGTGGCCTAC AACGCCCGCT ACCTCCTCGA GGCCCTCGCC 1020

CCCGTGGGGG ACCGGGCCCA CCTGGGCATC TCCGGGCCCA CGAGCCCGAG CCTCATCTGG 1080 

GGGGACGGGG AGGGGTACCG GGCGGTGGTG GTGCCCCTCA GGGTCTAG 1128

(2) INFORMATION FOR SEQ ID NO: 107:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 376 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107: 

Met Asn Ile Thr Val Pro Lys Lys Leu Leu Ser Asp Gln Leu Ser Leu
1 5 10 15

Leu Glu Arg Ile Val Pro Ser Arg Ser Ala Asn Pro Leu Tyr Thr Tyr
20 25 30

Leu Gly Leu Tyr Ala Glu Glu Gly Ala Leu Ile Leu Phe Gly Thr Asn
35 40 45

Gly Glu Val Asp Leu Glu Val Arg Leu Pro Ala Glu Ala Gln Ser Leu
50 55 60

Pro Arg Val Leu Val Pro Ala Gln Pro Phe Phe Gln Leu Val Arg Ser
65 70 75 80

Leu Pro Gly Asp Leu Val Ala Leu Gly Leu Ala Ser Glu Pro Gly Gln
85 90 95

Gly Gly Gln Leu Glu Leu Ser Ser Gly Arg Phe Arg Thr Arg Leu Ser
100 105 110

Leu Ala Pro Ala Glu Gly Tyr Pro Glu Leu Leu Val Pro Glu Gly Glu
115 120 125

Asp Lys Gly Ala Phe Pro Leu Arg Thr Arg Met Pro Ser Gly Glu Leu
130 135 140

Val Lys Ala Leu Thr His Val Arg Tyr Ala Ala Ser Asn Glu Glu Tyr
145 150 155 160

Arg Ala Ile Phe Arg Gly Val Gln Leu Glu Phe Ser Pro Gln Gly Phe
165 170 175

Arg Ala Val Ala Ser Asp Gly Tyr Arg Leu Ala Leu Tyr Asp Leu Pro
180 185 190

Leu Pro Gln Gly Phe Gln Ala Lys Ala Val Val Pro Ala Arg Ser Val
195 200 205

Asp Glu Met Val Arg Val Leu Lys Gly Ala Asp Gly Ala Glu Ala Val
210 215 220

Leu Ala Leu Gly Glu Gly Val Leu Ala Leu Ala Leu Glu Gly Gly Ser
225 230 235 240

Gly Val Arg Met Ala Leu Arg Leu Met Glu Gly Glu Phe Pro Asp Tyr
245 250 255

Gln Arg Val Ile Pro Gln Glu Phe Ala Leu Lys Val Gln Val Glu Gly
260 265 270

Glu Ala Leu Arg Glu Ala Val Arg Arg Val Ser Val Leu Ser Asp Arg
275 280 285

Gln Asn His Arg Val Asp Leu Leu Leu Glu Glu Gly Arg Ile Leu Leu
290 295 300

Ser Ala Glu Gly Asp Tyr Gly Lys Gly Gln Glu Glu Val Pro Ala Gln
305 310 315 320

Val Glu Gly Pro Asp Met Ala Val Ala Tyr Asn Ala Arg Tyr Leu Leu
325 330 335

Glu Ala Leu Ala Pro Val Gly Asp Arg Ala His Leu Gly Ile Ser Gly
340 345 350

Pro Thr Ser Pro Ser Leu Ile Trp Gly Asp Gly Glu Gly Tyr Arg Ala
355 360 365

Val Val Val Pro Leu Arg Val Glx 
370 375 

(2) INFORMATION FOR SEQ ID NO: 108: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 376 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108: 

Met Asn Ile Thr Val Pro Lys Lys Leu Leu Ser Asp Gln Leu Ser Leu 
1 5 10 15 

Leu Glu Arg Ile Val Pro Ser Arg Ser Ala Asn Pro Leu Tyr Thr Tyr 
20 25 30 

Leu Gly Leu Tyr Ala Glu Glu Gly Ala Leu Ile Leu Phe Gly Thr Asn 
35 40 45 

Gly Glu Val Asp Leu Glu Val Arg Leu Pro Ala Glu Ala Gln Ser Leu 
50 55 60 

Pro Arg Val Leu Val Pro Ala Gln Pro Phe Phe Gln Leu Val Arg Ser 
65 70 75 80 

Leu Pro Gly Asp Leu Val Ala Leu Gly Leu Ala Ser Glu Pro Gly Gln 
85 90 95 

Gly Gly Gln Leu Glu Leu Ser Ser Gly Arg Phe Arg Thr Arg Leu Ser 
100 105 110 

Leu Ala Pro Ala Glu Gly Tyr Pro Glu Leu Leu Val Pro Glu Gly Glu 
115 120 125 

Asp Lys Gly Ala Phe Pro Leu Arg Thr Arg Met Pro Ser Gly Glu Leu 
130 135 140 

Val Lys Ala Leu Thr His Val Arg Tyr Ala Ala Ser Asn Glu Glu Tyr 
145 150 155 160 

Arg Ala Ile Phe Arg Gly Val Gln Leu Glu Phe Ser Pro Gln Gly Phe 
165 170 175 

Arg Ala Val Ala Ser Asp Gly Tyr Arg Leu Ala Leu Tyr Asp Leu Pro 
180 185 190 

Leu Pro Gln Gly Phe Gln Ala Lys Ala Val Val Pro Ala Arg Ser Val 
195 200 205 

Asp Glu Met Val Arg Val Leu Lys Gly Ala Asp Gly Ala Glu Ala Val 
210 215 220 

Leu Ala Leu Gly Glu Gly Val Leu Ala Leu Ala Leu Glu Gly Gly Ser 
225 230 235 240 

Gly Val Arg Met Ala Leu Arg Leu Met Glu Gly Glu Phe Pro Asp Tyr 
245 250 255 

Gln Arg Val Ile Pro Gln Glu Phe Ala Leu Lys Val Gln Val Glu Gly 
260 265 270 

Glu Ala Leu Arg Glu Ala Val Arg Arg Val Ser Val Leu Ser Asp Arg 
275 280 285 

Gln Asn His Arg Val Asp Leu Leu Leu Glu Glu Gly Arg Ile Leu Leu 
290 295 300 

Ser Ala Glu Gly Asp Tyr Gly Lys Gly Gln Glu Glu Val Pro Ala Gln 
305 310 315 320 

Val Glu Gly Pro Asp Met Ala Val Ala Tyr Asn Ala Arg Tyr Leu Leu 
325 330 335 

Glu Ala Leu Ala Pro Val Gly Asp Arg Ala His Leu Gly Ile Ser Gly 
340 345 350 

Pro Thr Ser Pro Ser Leu Ile Trp Gly Asp Gly Glu Gly Tyr Arg Ala 
355 360 365 

Val Val Val Pro Leu Arg Val Glx 
370 375 

(2) INFORMATION FOR SEQ ID NO: 109: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 367 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109: 

Met Lys Phe Thr Val Glu Arg Glu His Leu Leu Lys Pro Leu Gln Gln 
1 5 10 15 

Val Ser Gly Pro Leu Gly Gly Arg Pro Thr Leu Pro Ile Leu Gly Asn 
20 25 30 

Leu Leu Leu Gln Val Ala Asp Gly Thr Leu Ser Leu Thr Gly Thr Asp 
35 40 45 

Leu Glu Met Glu Met Val Ala Arg Val Ala Leu Val Gln Pro His Glu 
50 55 60 

Pro Gly Ala Thr Thr Val Pro Ala Arg Lys Phe Phe Asp Ile Cys Arg 
65 70 75 80 

Gly Leu Pro Glu Gly Ala Glu Ile Ala Val Gln Leu Glu Gly Glu Arg 
85 90 95 

Met Leu Val Arg Ser Gly Arg Ser Arg Phe Ser Leu Ser Thr Leu Pro 
100 105 110 

Ala Ala Asp Phe Pro Asn Leu Asp Asp Trp Gln Ser Glu Val Glu Phe 
115 120 125 

Thr Leu Pro Gln Ala Thr Met Lys Arg Leu Ile Glu Ala Thr Gln Phe 
130 135 140 

Ser Met Ala His Gln Asp Val Arg Tyr Tyr Leu Asn Gly Met Leu Phe 
145 150 155 160 

Glu Thr Glu Gly Glu Glu Leu Arg Thr Val Ala Thr Asp Gly His Arg 
165 170 175 

Leu Ala Val Cys Ser Met Pro Ile Gly Gln Ser Leu Pro Ser His Ser 
180 185 190 

Val Ile Val Pro Arg Lys Gly Val Ile Glu Leu Met Arg Met Leu Asp 
195 200 205 

Gly Gly Asp Asn Pro Leu Arg Val Gln Ile Gly Ser Asn Asn Ile Arg 
210 215 220 

Ala His Val Gly Asp Phe Ile Phe Thr Ser Lys Leu Val Asp Gly Arg 
225 230 235 240 

Phe Pro Asp Tyr Arg Arg Val Leu Pro Lys Asn Pro Asp Lys His Leu 
245 250 255 

Glu Ala Gly Cys Asp Leu Leu Lys Gln Ala Phe Ala Arg Ala Ala Ile 
260 265 270 

Leu Ser Asn Glu Lys Phe Arg Gly Val Arg Leu Tyr Val Ser Glu Asn 
275 280 285 

Gln Leu Lys Ile Thr Ala Asn Asn Pro Glu Gln Glu Glu Ala Glu Glu 
290 295 300 

Ile Leu Asp Val Thr Tyr Ser Gly Ala Glu Met Glu Ile Gly Phe Asn 
305 310 315 320 

Val Ser Tyr Val Leu Asp Val Leu Asn Ala Leu Lys Cys Glu Asn Val 
325 330 335 

Arg Met Met Leu Thr Asp Ser Val Ser Ser Val Gln Ile Glu Asp Ala 
340 345 350 

Ala Ser Gln Ser Ala Ala Tyr Val Val Met Pro Met Arg Leu Glx 
355 360 365 

(2) INFORMATION FOR SEQ ID NO: 110: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 367 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 




(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110: 

Met Lys Phe Ile Ile Glu Arg Glu Gln Leu Leu Lys Pro Leu Gln Gln 
1 5 10 15 

Val Ser Gly Pro Leu Gly Gly Arg Pro Thr Leu Pro Ile Leu Gly Asn 
20 25 30 

Leu Leu Leu Lys Val Thr Glu Asn Thr Leu Ser Leu Thr Gly Thr Asp 
35 40 45 

Leu Glu Met Glu Met Met Ala Arg Val Ser Leu Ser Gln Ser His Glu 
50 55 60 

Ile Gly Ala Thr Thr Val Pro Ala Arg Lys Phe Phe Asp Ile Trp Arg 
65 70 75 80 

Gly Leu Pro Glu Gly Ala Glu Ile Ser Val Glu Leu Asp Gly Asp Arg 
85 90 95 

Leu Leu Val Arg Ser Gly Arg Ser Arg Phe Ser Leu Ser Thr Leu Pro 
100 105 110 

Ala Ser Asp Phe Pro Asn Leu Asp Asp Trp Gln Ser Glu Val Glu Phe 
115 120 125 

Thr Leu Pro Gln Ala Thr Leu Lys Arg Leu Ile Glu Ser Thr Gln Phe 
130 135 140 

Ser Met Ala His Gln Asp Val Arg Tyr Tyr Leu Asn Gly Met Leu Phe 
145 150 155 160 

Glu Thr Glu Asn Thr Glu Leu Arg Thr Val Ala Thr Asp Gly His Arg 
165 170 175 

Leu Ala Val Cys Ala Met Asp Ile Gly Gln Ser Leu Pro Gly His Ser 
180 185 190 

Val Ile Val Pro Arg Lys Gly Val Ile Glu Leu Met Arg Leu Leu Asp 
195 200 205 

Gly Ser Gly Glu Ser Leu Leu Gln Leu Gln Ile Gly Ser Asn Asn Leu 
210 215 220 

Arg Ala His Val Gly Asp Phe Ile Phe Thr Ser Lys Leu Val Asp Gly 
225 230 235 240 

Arg Phe Pro Asp Tyr Arg Arg Val Leu Pro Lys Asn Pro Thr Lys Thr 
245 250 255 

Val Ile Ala Gly Cys Asp Ile Leu Lys Gln Ala Phe Ser Arg Ala Ala 
260 265 270 

Ile Leu Ser Asn Glu Lys Phe Arg Gly Val Arg Ile Asn Leu Thr Asn 
275 280 285 

Gly Gln Leu Lys Ile Thr Ala Asn Asn Pro Glu Gln Glu Glu Ala Glu 
290 295 300 

Glu Ile Val Asp Val Gln Tyr Gln Gly Glu Glu Met Glu Ile Gly Phe 
305 310 315 320 

Asn Val Ser Tyr Leu Leu Asp Val Leu Asn Thr Leu Lys Cys Glu Glu 
325 330 335 

Val Lys Leu Leu Leu Thr Asp Ala Val Ser Ser Val Gln Val Glu Asn 
340 345 350 

Val Ala Ser Ala Ala Ala Ala Tyr Val Val Met Pro Met Arg Leu 
355 360 365 

(2) INFORMATION FOR SEQ ID NO: 111: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 366 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111: 

Met Gln Phe Ser Ile Ser Arg Glu Asn Leu Leu Lys Pro Leu Gln Gln 
1 5 10 15 

Val Cys Gly Val Leu Ser Asn Arg Pro Asn Ile Pro Val Leu Asn Asn 
20 25 30 

Val Leu Leu Gln Ile Glu Asp Tyr Arg Leu Thr Ile Thr Gly Thr Asp 
35 40 45 

Leu Glu Val Glu Leu Ser Ser Gln Thr Gln Leu Ser Ser Ser Ser Glu 
50 55 60 

Asn Gly Thr Phe Thr Ile Pro Ala Lys Lys Phe Leu Asp Ile Cys Arg 
65 70 75 80 

Thr Leu Ser Asp Asp Ser Glu Ile Thr Val Thr Phe Glu Gln Asp Arg 
85 90 95 

Ala Leu Val Gln Ser Gly Arg Ser Arg Phe Thr Leu Ala Thr Gln Pro 
100 105 110 

Ala Glu Glu Tyr Pro Asn Leu Thr Asp Trp Gln Ser Glu Val Asp Phe 
115 120 125 

Glu Leu Pro Gln Asn Thr Leu Arg Arg Leu Ile Glu Ala Thr Gln Phe 
130 135 140 

Ser Met Ala Asn Gln Asp Ala Arg Tyr Phe Leu Asn Gly Met Lys Phe 
145 150 155 160 

Glu Thr Glu Gly Asn Leu Leu Arg Thr Val Ala Thr Asp Gly His Arg 
165 170 175 

Leu Ala Val Cys Thr Ile Ser Leu Glu Gln Glu Leu Gln Asn His Ser 
180 185 190 

Val Ile Leu Pro Arg Lys Gly Val Leu Glu Leu Val Arg Leu Leu Glu 
195 200 205 

Thr Asn Asp Glu Pro Ala Arg Leu Gln Ile Gly Thr Asn Asn Leu Arg 
210 215 220 

Val His Leu Lys Asn Thr Val Phe Thr Ser Lys Leu Ile Asp Gly Arg 
225 230 235 240 

Phe Pro Asp Tyr Arg Arg Val Leu Pro Arg Asn Ala Thr Lys Ile Val 
245 250 255 

Glu Gly Asn Trp Glu Met Leu Lys Gln Ala Phe Ala Arg Ala Ser Ile 
260 265 270 

Leu Ser Asn Glu Arg Ala Arg Ser Val Arg Leu Ser Leu Lys Glu Asn 
275 280 285 

Gln Leu Lys Ile Thr Ala Ser Asn Thr Glu His Glu Glu Ala Glu Glu 
290 295 300 

Ile Val Asp Val Asn Tyr Asn Gly Glu Glu Leu Glu Val Gly Phe Asn 
305 310 315 320 

Val Thr Tyr Ile Leu Asp Val Leu Asn Ala Leu Lys Cys Asn Gln Val 
325 330 335 

Arg Met Cys Leu Thr Asp Ala Phe Ser Ser Cys Leu Ile Glu Asn Cys 
340 345 350 

Glu Asp Ser Ser Cys Glu Tyr Val Ile Met Pro Met Arg Leu 
355 360 365 
(2) INFORMATION FOR SEQ ID NO: 112: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 367 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112: 

Met His Phe Thr Ile Gln Arg Glu Ala Leu Leu Lys Pro Leu Gln Leu 
1 5 10 15 

Val Ala Gly Val Val Glu Arg Arg Gln Thr Leu Pro Val Leu Ser Asn 
20 25 30 

Val Leu Leu Val Val Gln Gly Gln Gln Leu Ser Leu Thr Gly Thr Asp 
35 40 45 

Leu Glu Val Glu Leu Val Gly Arg Val Gln Leu Glu Glu Pro Ala Glu 
50 55 60 

Pro Gly Glu Ile Thr Val Pro Ala Arg Lys Leu Met Asp Ile Cys Lys 
65 70 75 80 

Ser Leu Pro Asn Asp Ala Leu Ile Asp Ile Lys Val Asp Glu Gln Lys 
85 90 95 

Leu Leu Val Lys Ala Gly Arg Ser Arg Phe Thr Leu Ser Thr Leu Pro 
100 105 110 

Ala Asn Asp Phe Pro Thr Val Glu Glu Gly Pro Gly Ser Leu Thr Cys 
115 120 125 

Asn Leu Glu Gln Ser Lys Leu Arg Arg Leu Ile Glu Arg Thr Ser Phe 
130 135 140 

Ala Met Ala Gln Gln Asp Val Arg Tyr Tyr Leu Asn Gly Met Leu Leu 
145 150 155 160 

Glu Val Ser Arg Asn Thr Leu Arg Ala Val Ser Thr Asp Gly His Arg 
165 170 175 

Leu Ala Leu Cys Ser Met Ser Ala Pro Ile Glu Gln Glu Asp Arg His 
180 185 190 

Gln Val Ile Val Pro Arg Lys Gly Ile Leu Glu Leu Ala Arg Leu Leu 
195 200 205 

Thr Asp Pro Glu Gly Met Val Ser Ile Val Leu Gly Gln His His Ile 
210 215 220 

Arg Ala Thr Thr Gly Glu Phe Thr Phe Thr Ser Lys Leu Val Asp Gly 
225 230 235 240 

Lys Phe Pro Asp Tyr Glu Arg Val Leu Pro Lys Gly Gly Asp Lys Leu 
245 250 255 

Val Val Gly Asp Arg Gln Ala Leu Arg Glu Ala Phe Ser Arg Thr Ala 
260 265 270 

Ile Leu Ser Asn Glu Lys Tyr Arg Gly Ile Arg Leu Gln Leu Ala Ala 
275 280 285 

Gly Gln Leu Lys Ile Gln Ala Asn Asn Pro Glu Gln Glu Glu Ala Glu 
290 295 300 

Glu Glu Ile Ser Val Asp Tyr Glu Gly Ser Ser Leu Glu Ile Gly Phe 
305 310 315 320 

Asn Val Ser Tyr Leu Leu Asp Val Leu Gly Val Met Thr Thr Glu Gln 
325 330 335 

Val Arg Leu Ile Leu Ser Asp Ser Asn Ser Ser Ala Leu Leu Gln Glu 
340 345 350 

Ala Gly Asn Asp Asp Ser Ser Tyr Val Val Met Pro Met Arg Leu 
355 360 365 

(2) INFORMATION FOR SEQ ID NO: 113: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 366 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113: 

Met Lys Phe Thr Ile Gln Asn Asp Ile Leu Thr Lys Asn Leu Lys Lys 
1 5 10 15 

Ile Thr Arg Val Leu Val Lys Asn Ile Ser Phe Pro Ile Leu Glu Asn 
20 25 30 

Ile Leu Ile Gln Val Glu Asp Gly Thr Leu Ser Leu Thr Thr Thr Asn 
35 40 45 

Leu Glu Ile Glu Leu Ile Ser Lys Ile Glu Ile Ile Thr Lys Tyr Ile 
50 55 60 

Pro Gly Lys Thr Thr Ile Ser Gly Arg Lys Ile Leu Asn Ile Cys Arg 
65 70 75 80 

Thr Leu Ser Glu Lys Ser Lys Ile Lys Met Gln Leu Lys Asn Lys Lys 
85 90 95 

Met Tyr Ile Ser Ser Glu Asn Ser Asn Tyr Ile Leu Ser Thr Leu Ser 
100 105 110 

Ala Asp Thr Phe Pro Asn His Gln Asn Phe Asp Tyr Ile Ser Lys Phe 
115 120 125 

Asp Ile Ser Ser Asn Ile Leu Lys Glu Met Ile Glu Lys Thr Glu Phe 
130 135 140 

Ser Met Gly Lys Gln Asp Val Arg Tyr Tyr Leu Asn Gly Met Leu Leu 
145 150 155 160 

Glu Lys Lys Asp Lys Phe Leu Arg Ser Val Ala Thr Asp Gly Tyr Arg 
165 170 175 

Leu Ala Ile Ser Tyr Thr Gln Leu Lys Lys Asp Ile Asn Phe Phe Ser 
180 185 190 

Ile Ile Ile Pro Asn Lys Ala Val Met Glu Leu Leu Lys Leu Leu Asn 
195 200 205 

Thr Gln Pro Gln Leu Leu Asn Ile Leu Ile Gly Ser Asn Ser Ile Arg 
210 215 220 

Ile Tyr Thr Lys Asn Leu Ile Phe Thr Thr Gln Leu Ile Glu Gly Glu 
225 230 235 240 

Tyr Pro Asp Tyr Lys Ser Val Leu Phe Lys Glu Lys Lys Asn Pro Ile 
245 250 255 

Ile Thr Asn Ser Ile Leu Leu Lys Lys Ser Leu Leu Arg Val Ala Ile 
260 265 270 

Leu Ala His Glu Lys Phe Cys Gly Ile Glu Ile Lys Ile Glu Asn Gly 
275 280 285 

Lys Phe Lys Val Leu Ser Asp Asn Gln Glu Glu Glu Thr Ala Glu Asp 
290 295 300 

Leu Phe Glu Ile Asp Tyr Phe Gly Glu Lys Ile Glu Ile Ser Ile Asn 
305 310 315 320 

Val Tyr Tyr Leu Leu Asp Val Ile Asn Asn Ile Lys Ser Glu Asn Ile 
325 330 335 

Ala Leu Phe Leu Asn Lys Ser Lys Ser Ser Ile Gln Ile Glu Ala Glu 
340 345 350 

Asn Asn Ser Ser Asn Ala Tyr Val Val Met Leu Leu Lys Arg 
355 360 365 

(2) INFORMATION FOR SEQ ID NO: 114: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "PRIMER" 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114: 
GTGTGGATCC TCGTCCCCCT CATGCGCGAC CAGGAAGGG 39 
(2) INFORMATION FOR SEQ ID NO: 115: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 27 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "PRIMER" 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115: 
GTGTGGATCC GTGGTGACCT TAGCCAC 
(2) INFORMATION FOR SEQ ID NO: 116: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "PRIMER" 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116: 
TTCGTGTCCG AGGACCTTGT GGTCCACAAC 